OBITUARY FOR DENNIS BUTLER FRY*



The death of Professor Dennis B. Fry at the age of 75 on March 21, 1983
was a great blow to his colleagues, his many good friends, his wife
Chrystabel, and his three children.

Dennis Fry was born the third of November 1907 in Stockbridge, Hampshire,
England. After five years of teaching French, first at Tewkesbury Grammar
School and then at Kilburn Grammar School, in 1934 he was appointed Assistant
Lecturer in Phonetics at University College London, where he also became
Superintendent of the Phonetics Laboratory in 1937. In 1938 he was promoted
to Lecturer in Experimental Phonetics. In 1948, the year after the award of
his Ph.D. degree, he became Reader in Experimental Phonetics. From 1958 until
his retirement in 1975, he was Professor of Experimental Phonetics, the first
one to hold the title in Britain.


The Department of Phonetics of University College London owed much to
Dennis Fry's benevolent yet, I think, firm headship from 1949 to 1971.
Indeed, he played an important role in the later absorption of the newly
fledged program in linguistics to form the present Department of Phonetics
and Linguistics.


Dennis Fry's phonetic interests were broad and included such topics as
automatic speech recognition, the perception of lexical stress, children's
acquisition of phonology, categorical perception, and the relevance of
experimental phonetics for linguistics. He also did much important work on
problems of the deaf, especially deaf children; furthermore, he worked on
problems of hearing in aviation during his wartime service (1941-45) with the
acoustics laboratory of the RAF Central Medical Establishment. His extensive
publications up to the beginning of 1979 are listed in "Essays on the
Production and Perception of Speech in Honour of Dennis B. Fry," a special
issue of Language and Speech (Vol. 21, Part 1978).

From 1961, the time of the Fourth International Congress of Phonetic
Sciences in Helsinki, until his death, Fry served diligently as the President
of the Permanent Council for the Organization of International Congresses of
Phonetic Sciences. I know that our many excellent congresses regularly gave
him pleasure over the successful outcomes of the negotiations that the
Council, under his leadership, has been able to carry out with dedicated
scholars and scientists in so many places. In his last year he began talking
to some of us about encouraging able people in untried parts of the world to
mount equally good congresses.

Fry also furthered international cooperation in our field through his
link of more than twenty-five years with Haskins Laboratories, first in New
York City and then in New Haven. His occasional lengthy visits to do research
and his frequent consultations with some of us yielded important results on
both sides of the Atlantic Ocean.



*Also to appear in Speech Communication.



[HASKINS LABORATORIES: Status Report on Speech Research SR-76 (1983)]



In 1958 Dennis Fry founded the journal Language and Speech as an
important outlet for broadly interdisciplinary work. He was Editor until 1975
when he persuaded me to join him as Co-Editor; he left the editorship
altogether at the end of 1978, three years after his retirement from his
professorship.

His talent as a singer, a talent much enjoyed by operatic groups in his
region, went with a serious technical interest in music and the singing voice.
A very recent example of his publications in that field is his article "The
Singer and the Auditorium" in the 1980 volume of the Journal of Sound and
Vibration.

I should like to end with a personal note. From 1960 on, Dennis Fry's
humane and good-humored approach to people and problems gave me a role model
that I fear I shall never match. The sudden loss of his warm, caring
friendship was hard to take.

Arthur S. Abramson 

The University of Connecticut and Haskins Laboratories

Dennis Butler Fry
1907-1983 
SKILLED ACTIONS: A TASK DYNAMIC APPROACH
Elliot Saltzman and J. A. Scott Kelso+

Abstract. A task dynamic approach to skilled movements of
multidegree-of-freedom effector systems has been developed in which
task-specific, relatively autonomous action units are specified
within a functionally defined dynamical framework. Qualitative
distinctions among tasks (e.g., the body maintaining a steady vertical
posture or the hand reaching to a single spatial target versus
cyclic vertical hopping or repetitive hand motion between two
spatial targets) are captured by corresponding distinctions among
dynamical topologies (e.g., point attractor versus limit cycle
dynamics) defined at an abstract task space (or work space) level of
description. The approach provides a unified account for several
signature properties of skilled actions: trajectory shaping (e.g.,
hands move along approximately straight lines during unperturbed
reaches) and immediate compensation (e.g., spontaneous adjustments
occur over an entire effector system if a given part is disturbed en
route to a goal). Both of these properties are viewed as implicit
consequences of a task's underlying dynamics and, importantly, do
not require explicit trajectory plans or replanning procedures. Two
versions of task dynamics are derived (control law; network
coupling) as possible methods of control and coordination in artificial
(robotic, prosthetic) systems, and the network coupling version is
explored as a biologically relevant control scheme.

+Also University of Connecticut.
Acknowledgments. The preparation of this manuscript was supported in part by

I) Introduction

For animals to function effectively in their environments, their
movements must be coordinated in space and time. Though self-evident, this
fact raises a most fundamental issue that has recently attracted a number of
disciplines ranging from neuroscience to robotics and cognitive science,
viz. how coordination and control arise in complex, multivariable systems.
How are the many degrees of freedom adaptively harnessed during coordinated,
skilled actions? A deterrent to viable solutions to this problem rests in
part in our "limited ability to recognize the significant informational units
of movement" (Greene, 1971, p. xviii; see also Szentagothai & Arbib, 1974).
For some time, it has seemed questionable to us that nervous systems work
through individualized control of component elements, whether they be thought
of as joints or muscles. Instead, we believe (and there is an increasing
amount of evidence to support the claim) that the many potentially free
variables are partitioned naturally into collective functional units within
which the component elements may vary relatedly and autonomously. The behavior
of these action units or coordinative structures (Fowler, 1977; Kelso,
Southard, & Goodman, 1979; Turvey, Shaw, & Mace, 1978) is often exemplified by
the existence of relational invariances among kinematic and muscular events
during activities as diverse as locomotion, speech, handwriting, and reaching
to a target (see Grillner, 1982; Kelso, 1981; Kelso, Tuller, & Harris, 1983;
Schmidt, 1982; Viviani & Terzuolo, 1980).

The primary focus of the present paper is to characterize the style of
operation of these proposed action units within what we call a task dynamic
approach. The term task dynamics follows directly from the view (1) that the
degrees of freedom comprising action units are constrained by the particular
tasks that animals perform, and (2) that action units are specified in the
language of dynamics, not, as is more frequently assumed, in terms of
kinematic or muscular variables (cf. Stein, 1982, for an inventory). Thus we
propose, and seek to elaborate here, an invariant control structure that is
specified dynamically according to task requirements and that gives rise to
diverse kinematic consequences.

The paper is organized as follows: First we expand upon those desirable
properties of action units that are central to the explication of a task
dynamic framework. Second, we present a short tutorial on topological
dynamics, a crucial aspect of which is to link the system's geometrical
qualities to its dynamics in ways that are task-specific. These steps are
precursory to the introduction of the task dynamic approach, two versions of
which (control law, network coupling) will be presented. The task dynamic
approach will be shown to provide a viable account of such tasks as discrete
reaching, bringing a cup to the mouth, and turning a handle. It can also offer
a principled account of various compensatory behaviors such as those that
occur when an arm is perturbed during a reaching movement or when the support
base is perturbed during standing. Finally, it will be suggested that the
network coupling version of task dynamics both provides an extension of the
control law version and offers a new synthesis of recent physiological
findings on the planning and control of arm trajectories.

The significance of the task dynamic approach for a theory of
coordination and control is that it offers a unified account of certain
phenomena that heretofore have required conceptually distinct treatments in
the movement literature. In addition, the implications for design and control
of robotic and prosthetic devices will be apparent. In fact, the approach
shares some but not all of the features of several current developments in
manipulator control (cf. Hogan & Cotter, 1982; Raibert, Brown, Chepponis,
Hastings, Shreve, & Wimberly, 1981). But before discussing the task dynamic
framework in detail, we will describe the phenomena that led us, in part, to
propose the present theoretical approach. Indeed, it is the existence of these
phenomena, trajectory shaping and immediate compensation, that constitute the
main empirical results that the task dynamic framework is designed to explain.

The first phenomenon, trajectory shaping, refers to the task-specific
motion patterns of the terminal devices or end-effectors of the effector
systems associated with various types of skill. For example, it has been
observed experimentally that, in reaching tasks involving two joints (shoulder
and elbow) and two spatial hand motion dimensions, the hands move in
quasi-straight-line spatial trajectories from initial to target positions and
display single-peaked tangential velocity curves (e.g., Morasso, 1981).
Similarly and more obviously, in cup-to-mouth tasks the grasped cup maintains
a spillage-preventing, approximately horizontal orientation en route from
table to mouth.


The second phenomenon, immediate compensation, refers to the fact that
skilled movements show task-specific flexibility in attaining the task goal.
If one part of the system is perturbed, blocked, or damaged, the system is
able to compensate (assuming the disturbance is not "too big") by reorganizing
the activities of the remaining parts in order to achieve the original goal.
Further, such readjustments appear to occur automatically without the need to
detect the disturbance explicitly, replan a new movement, and execute the new
movement plan. Kelso, Tuller, and Fowler (1982) have demonstrated such
behavior in the speech articulators (jaw, upper and lower lip, tongue body)
when subjects produced the utterances /baeb/ or /baez/ across a series of
trials in which the jaw was occasionally and unpredictably tugged downward
while moving upward to the final /b/ or /z/ closure (see also Abbs & Gracco,
in press; Folkins & Abbs, 1975). The system's response to the jaw perturbation
was measured by observing the motions of the jaw and upper and lower lips as
well as
the electromyographic (EMG) activities of the orbicularis oris superior (upper
lip), orbicularis oris inferior (lower lip), and genioglossus (tongue body)
muscles. The investigators found relatively "immediate" task-specific
compensation (i.e., 20-30 ms from onset of jaw pull to onset of compensatory
response) in remote articulators to jaw perturbation. For /baeb/ (in which
final lip closure is crucial) they found increased upper lip activity (motion
and EMG) relative to the unperturbed control trials but normal tongue
activity; for /baez/ (in which final tongue-palate constriction is important)
they found increased tongue activity relative to controls, but normal upper
lip motion. The speed of these task-specific patterns indicates that
compensation does not occur according to traditionally defined "intentional"
reaction time processes, but rather according to an automatic, "reflexive"
type of organization. However, such an organization is not defined in a
hard-wired input/output manner. Instead, these data imply the existence of a
selective pattern of coupling or gating among the component articulators that
is specific to the utterance produced. Essentially, then, such compensatory
behavior represents the classic phenomenon of motor equivalence (Hebb, 1949;
Lashley, 1930) according to which a system will find alternate routes to a
given goal if an initially traversed route is unexpectedly blocked.

What type of sensorimotor organization could generate, in a task-specific
manner, both characteristic trajectory patterns for unperturbed movements and
spontaneous, compensatory behaviors for perturbed movements? We believe that
a task-dynamic approach provides at least the beginnings of a cohesive answer
to this question. Let us examine these issues, then, beginning with an
overview of action unit properties.
II) Units of Action

There are three major points to be made concerning our description of ac- 
tion units: 

1. Functional definition; special purpose device. Action units are defined
abstractly in a functional, task-specific fashion and span an ensemble of many
muscles or joints. Thus, they are not defined in a traditional reductionist
sense relative to single muscles and/or joints, nor are they hard-wired
input-output reflex arrangements. These units serve to constrain the
muscle/joint components of the collective to act cooperatively in a manner
specific to the task at hand. For different skilled actions, performers
transform the limbs temporarily into different special purpose devices whose
functions match the tasks being performed. Thus, an arm can become a
retriever, puncher, or polisher; a leg may become a walker or kicker; the
body can become a dancer or swimmer; the speech organs may become talkers,
singers, chewers, or swallowers, etc.

2. Autonomy. Action units operate relatively autonomously and are to a large
extent self-regulating. That is, once a given functional organization is
established over a muscle/joint collective, the system achieves its goal with
minimal "voluntary" intervention. In later discussions of the mathematics of
task dynamics, we will also indicate that action units are relatively
autonomous in a strict mathematical sense, i.e., the equations describing
task-dynamic systems are not explicit functions of an independent time
variable.

3. Dynamics. Action units are defined in the language of dynamics, not
kinematics (e.g., Fowler, Rubin, Remez, & Turvey, 1980; Kelso, Holt, Kugler,
& Turvey, 1980; Kugler, Kelso, & Turvey, 1980). The behavior of an effector
system is controlled by a task-specific patterning of the system's dynamic
parameters (e.g., stiffness, damping, etc.) according to the abstract
functional demands of the performed skill. Such dynamical patterning serves to
convert the effector system into the appropriate task-demanded special purpose
device. Further, this patterning both generates the observable motions that
are characteristic of that skill and underlies the ability to compensate
spontaneously for unpredicted disturbances. There is no explicit plan for the
desired kinematic trajectory in the action unit, nor is there an explicit
contingency table of replanning procedures for dealing with unexpected
perturbations. Rather, task-specific kinematic trajectories and compensatory
behaviors emerge from, or are implicit consequences of, the action unit's
dynamics. In this sense, most robots (with at least one notable exception,
i.e., Raibert et al., 1981) have no skills, but are controlled instead as
general-purpose devices using the same dynamical structure for all types of
tasks, e.g., spatial trajectory planning for the terminal device, conversion
to a joint velocity plan, and joint velocity servoing for both manipulators
(e.g., Whitney, 1972) and hexapod walker legs (e.g., McGhee & Iswandhi, 1979).

Given the above three points, one can formulate the problem of skill
learning as that of designing an action unit or coordinative structure whose
underlying dynamics are appropriate to the skill being learned. That is, in
acquiring a skill one is establishing a one-to-one correspondence between the
functional characteristics of the skill and the dynamical pattern underlying
the performance of that skill. This correspondence between dynamics and
function is perhaps the key concept underlying the task-dynamic approach. To
explore it more fully we will now: a) examine the geometric notion of topology
as it relates to a system's dynamics; and b) describe how functionally
specific dynamical topologies can be used to specify task-specific action
units or coordinative structures.

III) Topology and Dynamics

Quite (and perhaps too) simply in the context of skilled action, topology
refers to the qualitative aspects of a system's dynamics, e.g., whether a
system's dynamics generate 1) a discrete motion to a single target or 2) a
sustained cyclic motion between two targets. For a one-degree-of-freedom
rotational system such as the elbow joint (flexion-extension degree of
freedom) the first motion type might correspond to a positioning task with a
single joint angle target, while the second might correspond to a reciprocal
tapping task between two joint angle targets. What sort of dynamics might
underlie these qualitatively different tasks? For the discrete task, several
investigators have hypothesized that the system can be modeled as a damped
mass-spring system (e.g., Cooke, 1980; Fel'dman, 1966; Kelso, 1977; Kelso &
Holt, 1980; Polit & Bizzi, 1978; Schmidt & McGown, 1980). Such a dynamical
system may be described by the following equation of motion:



$$I\ddot{x} + b\dot{x} + k(x - x_0) = 0, \tag{1}$$

where

$I$ = moment of inertia about the rotation axis;
$b$ = damping (friction) coefficient;
$k$ = stiffness coefficient;
$x_0$ = equilibrium angle;
$x, \dot{x}, \ddot{x}$ = angular displacement and its respective first and
second time derivatives.



If we assume a set of constant dynamical parameters ($I$, $b$, $k$, $x_0$),
then the behavior of this system can be characterized by its point stability
or equifinality, in that it will come to rest at the specified $x_0$ target
despite various initial conditions for $x$ and $\dot{x}$ and despite any
transient perturbations encountered en route to the target.
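
The equifinality of equation 1 can be checked numerically. The following is a minimal sketch, not part of the original analysis: the parameter values, the semi-implicit Euler integrator, and the velocity "kick" used as a transient perturbation are all illustrative assumptions.

```python
def simulate_point_attractor(x, v, I=1.0, b=2.0, k=10.0, x0=0.5,
                             dt=0.001, t_end=20.0, kick_time=None, kick=0.0):
    """Integrate I*x'' + b*x' + k*(x - x0) = 0 (equation 1).

    Uses semi-implicit Euler steps. An optional velocity impulse `kick`
    applied at `kick_time` stands in for a transient perturbation.
    All default values are illustrative assumptions.
    """
    steps = int(round(t_end / dt))
    kick_step = -1 if kick_time is None else int(round(kick_time / dt))
    for n in range(steps):
        if n == kick_step:
            v += kick                      # transient perturbation en route
        a = -(b * v + k * (x - x0)) / I    # angular acceleration from eq. 1
        v += a * dt
        x += v * dt
    return x, v


# Different initial angles and velocities, plus one perturbed run.
runs = [
    simulate_point_attractor(-1.0, 0.0),
    simulate_point_attractor(0.0, 3.0),
    simulate_point_attractor(2.0, -1.0),
    simulate_point_attractor(-1.0, 0.0, kick_time=1.0, kick=2.0),
]
for x_final, v_final in runs:
    print(f"final angle = {x_final:.6f}, final velocity = {v_final:.6f}")
```

Under these assumptions every run, including the perturbed one, settles at the same rest angle with zero velocity, which is the behavior the text calls point stability.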

The behavior of such systems can be displayed graphically in two
different ways. In Figure 1A, the angle of an underdamped mass-spring with
constant coefficients is plotted as a function of time for a given set of
initial conditions and with no perturbations introduced. Defining the
equilibrium or rest angle as aligned with the abscissa, one observes the
system's point stability in the progressive decay of the amplitude to the
steady state rest angle. In Figure 1B, the same trajectory is represented
alternatively in the phase plane for which the abscissa and ordinate
correspond to $x$ and $\dot{x}$, respectively, and in which the system's $x_0$
is located at the phase plane origin. In the phase plane, the system's point
stability may be observed as the trajectory spirals down to the origin.
Theoretically, if one were to plot the phase plane trajectories corresponding
to all possible initial conditions, one would fill the plane with
qualitatively similar decaying trajectories defining, thereby, the system's
phase portrait. The qualitative "shape" of the system's phase portrait
reflects the system's dynamical topology, i.e., the characteristic relations
among the system's underlying dynamic parameters. For the type of system
described by equation 1, the corresponding phase portrait represents the
topology of a point attractor (Abraham & Shaw, 1982), and the underlying
dynamics may be described as point attractor dynamics. As a model of discrete
positioning tasks, such point attractor dynamics are appealing since the same
underlying topology will accommodate different trajectory characteristics
(e.g., peak velocity, movement time) and target positions by specification of
different values for the system's dynamic parameters.

Figure 2. Phase portraits for neutrally stable system (A) and periodic
attractor system (B), showing system trajectories for several initial
conditions.
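
As a rough illustration of how one point attractor topology accommodates different kinematics, the sketch below varies the stiffness and the rest angle in equation 1 and records peak velocity and movement time. Critical damping, the 2% settling criterion used to define movement time, and all numerical values are assumptions made for the example.

```python
import math


def reach_metrics(k, x0, I=1.0, dt=0.0005, t_end=10.0, tol=0.02):
    """Simulate equation 1 from rest at x = 0 with critical damping.

    Returns (final angle, peak velocity, movement time), where movement
    time is the last instant at which the angle is still farther than
    tol * |x0| from the target. Critical damping is an assumption made
    here to keep the reach free of overshoot.
    """
    b = 2.0 * math.sqrt(I * k)
    x, v = 0.0, 0.0
    peak_v, move_time = 0.0, 0.0
    for n in range(int(round(t_end / dt))):
        a = -(b * v + k * (x - x0)) / I
        v += a * dt
        x += v * dt
        peak_v = max(peak_v, abs(v))
        if abs(x - x0) > tol * abs(x0):
            move_time = (n + 1) * dt
    return x, peak_v, move_time


slow = reach_metrics(k=10.0, x0=1.0)   # compliant setting
fast = reach_metrics(k=40.0, x0=1.0)   # stiffer setting, same target
far = reach_metrics(k=10.0, x0=2.0)    # same stiffness, farther target

for label, (x_final, peak_v, move_time) in (("slow", slow), ("fast", fast), ("far", far)):
    print(f"{label}: final = {x_final:.4f}, peak velocity = {peak_v:.3f}, "
          f"movement time = {move_time:.3f} s")
```

In this sketch the stiffer setting reaches the same target with a higher peak velocity and a shorter movement time, while changing only the rest angle relocates the target without altering the qualitative shape of the motion.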

Obviously, another type of dynamics is required to generate the
kinematics observed in a sustained cyclic elbow (or finger, e.g., Kelso, Holt,
Rubin, & Kugler, 1981) rotation between two target angles. Perhaps the
simplest dynamic scheme corresponds to that of an undamped mass-spring system
or harmonic oscillator, with the following equation of motion:



$$I\ddot{x} + k(x - x_0) = 0, \tag{2}$$

where all symbols are defined as in equation (1). The solid line trajectory
in Figure 2A represents the phase plane orbit of such a system, which
oscillates about the origin ($x_0$) with an amplitude that is determined by
the system's total mechanical energy, and whose angular targets correspond to
the system's maximum and minimum angular limits. However, this type of system
does not provide a satisfactory model for the cyclic elbow task for two
reasons: a) it represents the ideal frictionless case and no real world system
is frictionless. Adding friction to equation (2) would simply convert it to
equation (1), leaving a point attractor dynamics unsuitable for any sustained
cyclic task; and b) the system described by equation 2 is only neutrally
stable in that the oscillation amplitude is extremely plastic with respect to
both initial system energy (determined by initial conditions of position and
velocity) and transient changes (perturbations) in energy imposed during
oscillations. For example, the dotted trajectories in Figure 2A represent
oscillations of the same system as does the solid trajectory. However, the
inner and outer dotted orbits show the oscillations corresponding to smaller
and larger amplitude initial conditions, respectively, relative to the solid
orbit. Clearly then, for a task whose oscillation amplitude is crucial, a
neutrally stable system is undesirable.
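
The neutral stability of equation 2 can be seen in a short simulation. This is a sketch under assumed parameter values: each run keeps whatever amplitude its initial conditions give it, and a velocity kick (a stand-in for a transient perturbation) changes the amplitude permanently.

```python
def oscillation_amplitude(x_init, v_init=0.0, I=1.0, k=10.0, x0=0.0,
                          dt=0.001, t_end=20.0, kick_time=None, kick=0.0):
    """Integrate I*x'' + k*(x - x0) = 0 (equation 2) and return the largest
    excursion from x0 seen during the final 5 s of the run.

    Semi-implicit Euler is used because it does not add or remove energy
    systematically for this undamped system. All values are assumptions.
    """
    steps = int(round(t_end / dt))
    kick_step = -1 if kick_time is None else int(round(kick_time / dt))
    tail_start = steps - int(round(5.0 / dt))
    x, v, amplitude = x_init, v_init, 0.0
    for n in range(steps):
        if n == kick_step:
            v += kick                       # transient change in energy
        v += -(k * (x - x0) / I) * dt
        x += v * dt
        if n >= tail_start:
            amplitude = max(amplitude, abs(x - x0))
    return amplitude


initial_amplitudes = (0.5, 1.0, 1.5)
amplitudes = [oscillation_amplitude(a0) for a0 in initial_amplitudes]
kicked = oscillation_amplitude(1.0, kick_time=2.0, kick=3.0)

for a0, a in zip(initial_amplitudes, amplitudes):
    print(f"started at {a0:.1f}, still oscillating at {a:.3f}")
print(f"started at 1.0 and kicked, now oscillating at {kicked:.3f}")
```

No orbit is preferred: the three unperturbed runs remain on three different orbits, and the kicked run never returns to its original one.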

One can overcome the above shortcomings of an undamped mass-spring
dynamics, however, by moving to an alternate periodic attractor (Abraham &
Shaw, 1982) dynamical model, with the following equation of motion:



$$I\ddot{x} + b\dot{x} + k(x - x_0) = f(x, \dot{x}), \tag{3}$$

where

$I, b, k, x_0, x, \dot{x}, \ddot{x}$ are as in equations 1 and 2; and

$f(x, \dot{x})$ = nonlinear escapement function of the system's current
$x, \dot{x}$.

This system's behavior is characterized by the three phase plane trajectories
seen in Figure 2B corresponding to three different sets of initial conditions.
The solid trajectory represents a motion starting at either target, and the
inner and outer dotted trajectories represent motions starting inside and
outside, respectively, of the target-to-target angular range. It can be seen
that these trajectories converge onto the solid orbit, which is described as a
stable limit cycle or periodic attractor. In fact, all trajectories (except
those starting exactly at $x_0$) converge to the limit cycle, and the
corresponding phase portrait captures the topology of this periodic attractor
dynamical system. The reason for this orbital stability lies in the nature of
the nonlinear escapement term, $f(x, \dot{x})$, seen in equation 3. Basically
this term is the means by which the system taps an external energy source in a
self-gated manner, i.e., energy is gated in or out of the system as a function
of the system's current $x, \dot{x}$ state. On the limit cycle, the energy
tapped per cycle from the external reservoir is equal to the energy dissipated
per cycle both by the system's intrinsic damping properties (i.e., $b$) and
the system's escapement function. Inside the limit cycle, the energy tapped
per cycle is greater than that dissipated, and trajectories grow or spiral out
to the limit cycle; outside the limit cycle, the converse situation holds, and
trajectories decay or spiral down to the limit cycle (cf. Minorsky, 1962).
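
The text leaves the escapement function unspecified. Purely as an illustration, the sketch below assumes a van der Pol-type escapement, $f(x, \dot{x}) = [b + \mu(1 - ((x - x_0)/c)^2)]\dot{x}$, where $\mu$ and $c$ are hypothetical parameters introduced for this example; for small $\mu$ the resulting limit cycle has an amplitude close to $2c$. Other escapement functions would serve equally well.

```python
def limit_cycle_amplitude(x_init, v_init=0.0, I=1.0, b=0.4, k=10.0, x0=0.0,
                          mu=0.5, c=0.5, dt=0.001, t_end=60.0):
    """Integrate equation 3 with an ASSUMED van der Pol-type escapement,

        f(x, v) = (b + mu * (1 - ((x - x0) / c) ** 2)) * v,

    and return the largest excursion from x0 during the final 5 s.
    mu and c are hypothetical parameters, not taken from the source.
    """
    steps = int(round(t_end / dt))
    tail_start = steps - int(round(5.0 / dt))
    x, v, amplitude = x_init, v_init, 0.0
    for n in range(steps):
        y = x - x0
        f = (b + mu * (1.0 - (y / c) ** 2)) * v   # energy gated by current state
        a = (f - b * v - k * y) / I               # equation 3 solved for x''
        v += a * dt
        x += v * dt
        if n >= tail_start:
            amplitude = max(amplitude, abs(x - x0))
    return amplitude


inside = limit_cycle_amplitude(0.1)    # starts inside the limit cycle
outside = limit_cycle_amplitude(2.0)   # starts outside the limit cycle
print(f"amplitude after starting inside:  {inside:.3f}")
print(f"amplitude after starting outside: {outside:.3f}")
```

With this assumed escapement, the run that starts inside grows and the run that starts outside decays until both settle on the same orbit, in contrast to the neutrally stable system of equation 2.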

The above examples illustrate how particular distinct task functions
(discrete positioning vs. cyclic alternation) may be modeled by topologically
distinct dynamical systems. It should be noted, however, that both tasks and
dynamics were defined in single degree of freedom systems. In these cases one
dimensional motions were demanded by the tasks and these task requirements
were mapped directly onto corresponding dynamical control types at a single
joint. This style of control, in which task-specific sets of constant dynamic
parameters are defined with respect to control at single joints or articulator
degrees of freedom, may be labeled articulator dynamics. Real world tasks
seldom involve such simple one-to-one mappings of task demands into sets of
constant articulator dynamic parameters. Consider, for example, the two
dimensional discrete reaching task discussed earlier in the Introduction
involving two articulator degrees of freedom (shoulder, elbow) and two spatial
dimensions of terminal device (hand) motion. Extending an articulator dynamics
approach to this more complex task meets with only limited success, providing
a reasonable account of final position control but failing to account for the
observed characteristic quasi-straight line hand trajectory patterns
(Delatizky, 1982). More specifically, in this two dimensional task the arm is
effectively nonredundant (e.g., Saltzman, 1979) and, given the anatomical
limits on joint angular excursion, there is a unique mapping from hand
position to arm configuration (i.e., the set of shoulder and elbow angles; arm
posture). Therefore, if one defines constant point attractor dynamics at each
joint with rest angles corresponding to the target arm configuration (and thus
target hand position), the hand/arm will exhibit equifinality by attaining the
desired target position/configuration despite variation in initial
position/posture and despite transient disturbances encountered en route to
the target. However, as mentioned above (and to be explained in greater detail
below), such an articulator dynamics approach fails to account for the
characteristic trajectory patterns seen in these reaching tasks, i.e., this
approach does not "favor straight line movements over other movements"
(Hollerbach, 1982, p. 190).
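
The limitation of articulator dynamics can be made concrete with a deliberately simplified sketch. It assumes a planar two-link arm with hypothetical link lengths, runs an independent critically damped point attractor (equation 1) at each joint, and ignores the inertial coupling between joints that a real arm would have. The start and target postures are arbitrary choices.

```python
import math

L1, L2 = 0.3, 0.3          # assumed upper-arm and forearm lengths (m)


def hand_position(shoulder, elbow):
    """Forward kinematics of a planar two-link arm (angles in radians)."""
    x = L1 * math.cos(shoulder) + L2 * math.cos(shoulder + elbow)
    y = L1 * math.sin(shoulder) + L2 * math.sin(shoulder + elbow)
    return x, y


def joint_level_reach(start, target, I=1.0, k=10.0, dt=0.001, t_end=10.0):
    """Run an independent, critically damped point attractor (equation 1)
    at each joint and return the resulting hand path."""
    b = 2.0 * math.sqrt(I * k)
    angles = list(start)
    rates = [0.0, 0.0]
    path = [hand_position(*angles)]
    for _ in range(int(round(t_end / dt))):
        for j in range(2):
            a = -(b * rates[j] + k * (angles[j] - target[j])) / I
            rates[j] += a * dt
            angles[j] += rates[j] * dt
        path.append(hand_position(*angles))
    return path


def max_deviation_from_chord(path):
    """Largest perpendicular distance from the path to the straight line
    joining its first and last points."""
    (xa, ya), (xb, yb) = path[0], path[-1]
    dx, dy = xb - xa, yb - ya
    length = math.hypot(dx, dy)
    return max(abs(dx * (y - ya) - dy * (x - xa)) / length for x, y in path)


start = (math.radians(20.0), math.radians(120.0))    # shoulder, elbow
target = (math.radians(80.0), math.radians(40.0))
path = joint_level_reach(start, target)
deviation = max_deviation_from_chord(path)
goal = hand_position(*target)
final_error = math.hypot(path[-1][0] - goal[0], path[-1][1] - goal[1])
print(f"final hand error: {final_error:.2e} m")
print(f"max deviation from straight line: {deviation:.3f} m")
```

Under these assumptions the hand attains the target (equifinality) but travels along a visibly curved path, several centimeters away from the straight line over a reach of about 30 cm. Nothing in the joint-level attractors favors a straight hand trajectory.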

At this point, then, those committed to a dynamical account of
coordinated movement face a nasty dilemma. The conceptually parsimonious
account of motor control via articulator dynamics no longer appears valid.
That is, the elegance of the articulator dynamic account for single degree of
freedom tasks lay in its use of a set of constant, task-specific,
articulator-dynamic parameters to generate a potentially infinite number of
task-appropriate kinematic trajectories. The failure of such an approach when
extended to trajectory shaping in a multidegree-of-freedom task as simple as
reaching shows that searching for invariant task-specific action units at the
level of articulator dynamics is likely to be a frustrating and probably
pointless endeavor. What type of principles or control structures might
underlie the trajectory constraints on arm motion during reaching tasks? There
are (at least) two alternative accounts. The first is simply to abandon the
dynamical approach altogether, and invoke explicit kinematic trajectory plans
as sources for the characteristic constraints on motion patterns observed in
different tasks. Such an approach has been generally adopted in the field of
robotics (e.g., Hollerbach, 1982; Saltzman, 1979), and has been described in
the following fashion by Hollerbach (1982):

A hierarchal movement plan is developed at three levels of
abstraction... The top level is the object level, where a task command, such
as 'pick up the cup', is converted into a planned trajectory [italics
added] for the hand or for the object held by the hand. At the
joint level the object trajectory is converted to co-ordinated
control of the multiple joints of the human or robotic arm. At the
actuator level the joint movements are converted to appropriate
motor or muscle activations.
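
For contrast with the dynamical accounts, the first two levels of this kinematic planning hierarchy can be sketched for a planar two-link arm. The link lengths, start and goal positions, and function names are hypothetical; the inverse kinematics is the standard closed-form solution for a two-link arm, and the actuator level is omitted.

```python
import math

L1, L2 = 0.3, 0.3   # assumed link lengths (m)


def plan_hand_path(start, goal, n_points=50):
    """Object level: an explicit straight-line hand trajectory."""
    return [(start[0] + (goal[0] - start[0]) * i / (n_points - 1),
             start[1] + (goal[1] - start[1]) * i / (n_points - 1))
            for i in range(n_points)]


def inverse_kinematics(x, y):
    """Joint level: shoulder and elbow angles that place the hand at (x, y),
    choosing the solution with a non-negative elbow angle."""
    cos_elbow = (x * x + y * y - L1 * L1 - L2 * L2) / (2.0 * L1 * L2)
    cos_elbow = max(-1.0, min(1.0, cos_elbow))   # guard against rounding
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(L2 * math.sin(elbow),
                                             L1 + L2 * math.cos(elbow))
    return shoulder, elbow


def forward_kinematics(shoulder, elbow):
    """Hand position produced by a given pair of joint angles."""
    return (L1 * math.cos(shoulder) + L2 * math.cos(shoulder + elbow),
            L1 * math.sin(shoulder) + L2 * math.sin(shoulder + elbow))


hand_plan = plan_hand_path((0.30, 0.20), (-0.10, 0.45))
joint_plan = [inverse_kinematics(x, y) for x, y in hand_plan]
print(f"first joint angles (rad): {joint_plan[0]}")
print(f"last joint angles (rad):  {joint_plan[-1]}")
```

Here the straight hand path exists only because it was planned explicitly at the object level and then converted point by point into joint angles.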



Alternatively, the second account involves defining dynamical control
topologies at a level of task description more abstract than the level of
individual joints. This leads us to a task-dynamic account of skilled
actions.

IV) Task Dynamics 

Previous articulator-dynamic descriptions of skilled movement provided
plausible accounts of only a very limited type of data: that obtained in
laboratory tasks where uni-dimensional tasks mapped directly onto control at a
single joint. For example, a discrete target acquisition task was thought to
involve specifying the dynamic rest angle parameter corresponding to the
task's target joint angle. However, given the failure of articulator dynamics
to account for data observed in more complex multivariable tasks, one begins
to suspect that this approach might be inappropriate even as a model for
control of single variable tasks. More specifically, one reaches the
conclusion that the dynamics underlying control of a single joint task might
be defined more abstractly than at the articulator level (or joint level; see
Hollerbach's, 1982, quote above).

On the basis of a logical analysis of performances across a set of
multivariable real world tasks, two common aspects shared by all tasks become
evident: a) tasks are typically defined for the terminal devices associated
with task-relevant multidegree-of-freedom effector systems (e.g., the grasped
cup and arm-trunk, respectively, for a cup-to-mouth task); and b) tasks
typically demand characteristic patterns of motion or force by these terminal
devices relative to a set of task-specific spatial axes or degrees of freedom.
Thus, a given task type can be associated with a corresponding task-spatial
coordinate system (task space) that is defined on the basis of both the
terminal devices and the environmental objects or surfaces relevant to the
task's performance. In fact, Soechting (1982) has presented evidence from a
pointing task involving the elbow joint that implies that the controlled
variable is not joint angle per se, but rather the orientation angle of the
forearm in a spatial coordinate system defined relative to an environmental
reference (e.g., the floor surface, or gravity vector, etc.) or the actor's
trunk. This suggests that a task-spatial coordinate system might indeed be the
appropriate level at which to characterize a skilled action.
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The central tenet of the task-dynamlc approach is that a set of constant 
task-dynamic parameters can be defined for each of a given skill's task^pace 
degrees of freedoin» defining* thereby a one-*to^ne correspondence between the 
func~tl6nar characteristics or the sklTlf OT<r ttre'ta^k^^dynamteai-^topology^uiKJer — 
lying that skill's performance. In other words, sklll*lnvarlant action units 
are defined functionally relative to a given skill's task space and underlying 
task-^patlal dynamics (more simply, task dynamics). Such sets of constant 
task-dynamical parameters may be used to define changing patterns of artlcula-* 
tor-*level dynamic parameters (e.g., joint stiffnesses, dampings, rest angles, 
etc.) according to two related versions ( control law and network coupling ) of 
the task dynamic approach. The evolving constraints on articulator dynamics 
serve to convert a given akill's effector system into an appropriate special 
purpose device whose individual components (i.e., articulator degrees of free* 
doaX act cooperatively it} a manner specific to the task at hand. It should be 
remembered for purposes of comparison that the articulator-dynamlcs approach 
postulated sets of constant dynamic parameters at the inaividual joint level; 
a given> set would underlie the resultant variety of equifinal kinematic tra* 
jectories for a given type of single degree of freedom task. In contrast, the 
task dynamic approach postulates sets of constant dynamic parameters at^the 
task-spatial level for given types multi variable tasks. A given set of 
such oarameters would: a) underlie directly an articulator estate dependent 
patterning of articulator dynamic parameters; and b) underlie ultimately the 
task^specif ic trajectory patterns and cosftensatory behaviors observed during 
ta3k performances. tfe will now provide an overview of the specifics of the 
task -dynamic approach, using a relatively^ 3iav>le arm reaching task for 
illustrative purpose'^. A schematic of the approach and the coordinate 
transformations involved Is shown in Figure 3. 

A. Task dynamics: Task network

1. Task space. A task-dynamic approach to a given skill begins with an abstract, functional description of that skill's task space. Such a description has three parts. First, the relevant terminal devices and goal objects or surfaces are defined. Second, an appropriate number of task axes or degrees of freedom are defined relative to the terminal device and goal referents; and finally, an appropriate type of task dynamic topology is defined along each task axis. For a discrete reaching task in two spatial dimensions, the corresponding task space is modeled as a two-dimensional point attractor and is illustrated in Figure 4A. In this figure, the reach target (x) defines the origin of a $t_1 t_2$ Cartesian coordinate system. Axis $t_1$ (the "reach axis") is oriented along the line from the target to the initial position of the terminal device (open circle), which is modeled as an abstract point task-mass. Axis $t_2$ is defined orthogonal to $t_1$ and measures deviations of the task mass from the reach axis. The task-mass is allowed to assume any $t_1 t_2$ position (filled circle) during task performance, and may be considered an abstract point mass since it is not tied to any particular effector system. The equations of motion corresponding to axes $t_1$ and $t_2$ are as follows:

$$m_T \ddot{t}_1 + b_{T1} \dot{t}_1 + k_{T1} t_1 = 0 \qquad (4a)$$

$$m_T \ddot{t}_2 + b_{T2} \dot{t}_2 + k_{T2} t_2 = 0, \text{ where} \qquad (4b)$$

$m_T$ = task-mass coefficient;
$b_{T1}$, $b_{T2}$ = damping coefficients;
$k_{T1}$, $k_{T2}$ = stiffness coefficients.

Figure 3. Overview of descriptive levels in task dynamic approach.

Figure 4. A. Discrete reaching (task space); B. System trajectories corresponding to different task axis weightings and initial conditions.

In Figure 4A the corresponding damping and stiffness elements are represented in lumped form by the squiggles in the lines connecting the task mass to axes $t_1$ and $t_2$. Equation (4) describes a linear, uncoupled set of task-spatial dynamic equations, whose terms are defined in units of force, and whose dynamic parameters are constant. This equation can be represented in matrix form:

$$M_T \ddot{t} + B_T \dot{t} + K_T t = 0, \text{ where} \qquad (5)$$

$$M_T = \begin{pmatrix} m_T & 0 \\ 0 & m_T \end{pmatrix}; \quad B_T = \begin{pmatrix} b_{T1} & 0 \\ 0 & b_{T2} \end{pmatrix}; \quad K_T = \begin{pmatrix} k_{T1} & 0 \\ 0 & k_{T2} \end{pmatrix}.$$

It should be noted that there are two nested structures of dynamical constraints at the task space level. The first constraint structure is defined globally, and serves to establish a task-specific dynamical topology. In our reaching example, these global constraints on the task-dynamic coefficients specified point attractor topologies along each task axis. Additionally, however, a set of locally defined, metrical constraints serve to tune the task-spatial dynamic parameters ($M_T$, $B_T$, $K_T$) according to current task demands. Thus, in the reaching example $m_T$ designates the perceptually estimated mass of the terminal device (i.e., gripper + any grasped object-to-be-moved), and $B_T$ and $K_T$ are specified, for example, according to the desired or required damping ratios ($\zeta_{Ti} = b_{Ti}/[2\sqrt{m_T k_{Ti}}]$; $i = 1, 2$) and settling times ($T_{si} = 4/[\zeta_{Ti}\sqrt{k_{Ti}/m_T}]$; $i = 1, 2$; i.e., the time required for the system to settle within 2% of the target amplitude; Dorf, 1974) along each task axis.
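The tuning step just described can be sketched as a small calculation. This is a minimal illustration, not the authors' procedure: it assumes the standard damping ratio $\zeta = b/(2\sqrt{mk})$ and the usual 2% settling-time approximation $T_s = 4/(\zeta\omega_0)$ with $\omega_0 = \sqrt{k/m}$. The function name `tune_axis` and the numerical values are hypothetical.

```python
import math

def tune_axis(m_T, zeta, settling_time):
    """Return (b_T, k_T) for one task axis.

    Assumes zeta = b_T / (2*sqrt(m_T*k_T)) and the 2% settling-time
    approximation T_s = 4 / (zeta * omega_0), omega_0 = sqrt(k_T / m_T).
    """
    omega_0 = 4.0 / (zeta * settling_time)   # undamped natural frequency (rad/s)
    k_T = m_T * omega_0 ** 2                 # stiffness coefficient
    b_T = 2.0 * zeta * math.sqrt(m_T * k_T)  # damping coefficient
    return b_T, k_T

# Hypothetical demands: 1 kg task mass, critical damping, 0.8 s settling time.
b_T1, k_T1 = tune_axis(m_T=1.0, zeta=1.0, settling_time=0.8)
print(b_T1, k_T1)
```

Because the settling-time expression is itself an approximation, the resulting coefficients should be read as starting values for a given task demand rather than exact solutions.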

The movements of the task mass in reaching space display two properties highly desirable for the terminal devices of real world reaching tasks. Due to the point attractor dynamics, the movements will exhibit equifinality in that the task mass will come to rest at the target regardless of initial position (i.e., by definition, initial distance along $t_1$) and velocity (i.e., initial direction and speed of task space motion) and despite transient perturbations introduced en route to the target. Additionally, the task mass will show straight line trajectories during unperturbed motions to the target, since in this case the system is effectively one-dimensional by virtue of the definition of the reach axis. However, motions in which the task mass is perturbed away from the reach axis will display trajectory shapes that depend on the relative values of $k_{T1}$ and $k_{T2}$ (assuming equivalent damping properties along each axis) as well as the position in $t_1 t_2$ space where the perturbation "deposits" the task mass (see Figure 4B). Assuming critical damping along both task axes (i.e., $\zeta_{T1} = \zeta_{T2} = 1.0$) and a post-perturbation velocity of zero, then: a) when $k_{T1} < k_{T2}$, the task mass will approach the reach axis faster than axis $t_2$; b) when $k_{T1} > k_{T2}$, the task mass will approach axis $t_2$ faster than $t_1$; and c) when $k_{T1} = k_{T2}$, the task mass will approach $t_1$ and $t_2$ at the same speed, showing a straight line post-perturbation trajectory to the target. A straight line post-perturbation trajectory will also result if, regardless of the relative values of $k_{T1}$ and $k_{T2}$, the task-mass is deposited precisely on the $t_2$ axis (Figure 4B, trajectory 6). The reason for these relationships between perturbed position, relative axis stiffness, and trajectory shape lies in the shape of the potential energy functions corresponding to these different relative $k_{T1}$ and $k_{T2}$ values, and the resultant constraints placed on the ensuing motions of the task-mass when starting at various post-perturbation locations on these manifolds (see Hogan, 1980, for a more detailed discussion of potential energy functions and spring stiffnesses in a similar two-dimensional mass-spring system). Finally, note that these free and perturbed trajectories evolve as implicit consequences of the underlying task-space dynamics and, therefore, do not reflect the use of either explicit trajectory plans or replanning procedures.
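The dependence of post-perturbation trajectory shape on relative axis stiffness can be illustrated with a short sketch. It assumes critical damping and zero post-perturbation velocity on each axis, for which each coordinate follows the closed-form solution $t_i(\tau) = t_i(0)\,(1 + \omega_i\tau)\,e^{-\omega_i\tau}$ with $\omega_i = \sqrt{k_{Ti}/m_T}$. All function names and numerical values are hypothetical.

```python
import math

def critically_damped(t0, k_T, m_T, time):
    """Position along one task axis at `time`, released at rest from t0."""
    omega = math.sqrt(k_T / m_T)
    return t0 * (1.0 + omega * time) * math.exp(-omega * time)

def trajectory(start, k_T1, k_T2, m_T=1.0, duration=3.0, steps=300):
    """Sample (t1, t2) positions after the task mass is deposited at `start`."""
    points = []
    for i in range(steps + 1):
        time = duration * i / steps
        points.append((critically_damped(start[0], k_T1, m_T, time),
                       critically_damped(start[1], k_T2, m_T, time)))
    return points

def max_deviation_from_chord(points):
    """Largest perpendicular distance from the straight line start -> target."""
    x0, y0 = points[0]
    length = math.hypot(x0, y0)
    return max(abs(x0 * y - y0 * x) / length for x, y in points)

start = (1.0, 0.5)                                   # hypothetical deposit point
equal = trajectory(start, 25.0, 25.0)                # k_T1 = k_T2: straight line
stiff_t2 = trajectory(start, 25.0, 100.0)            # k_T1 < k_T2: curved path
on_t2_axis = trajectory((0.0, 0.5), 25.0, 100.0)     # deposited on the t2 axis

for name, path in (("equal", equal), ("stiff_t2", stiff_t2), ("on_t2_axis", on_t2_axis)):
    print(name, max_deviation_from_chord(path), path[-1])
```

In all three cases the final sample lies at the target (equifinality); only the case with unequal stiffnesses and an off-axis deposit point departs from the straight chord.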




Figure 5. Discrete reaching: A. Body space. Task space is embedded in a shoulder-centered coordinate system; B. Task network. Body space description is transformed into joint variable form of massless model arm.



2. Body space. The above patterns of task spatial dynamic parameters were defined relative to an environmentally defined goal location and an abstract disembodied terminal device. If these patterns are to be useful to a performer, they must first be transformed into egocentric or body spatial form (e.g., Saltzman, 1979). Such a transformation must be sensitive to the current spatial or geometric relationship between the performer and the task space. As illustrated in Figure 5A for a reaching task, this corresponds to locating and orienting the task space relative to a body spatial ($x_1$, $x_2$) coordinate system whose origin corresponds to the current location of the shoulder's rotation axis. Thus, the terminal device's (task-mass's) current location may be specified in $x_1 x_2$ coordinates. Further, the set of locally defined constraints given by the spatial relationship between task and body spaces serve to tune the body spatial dynamic parameters $x_0 = (x_{01}, x_{02})$ (the location of the task space origin in body space coordinates) and $\Phi$ (the orientation angle between the task space's reach axis $t_1$ and body space axis $x_1$). Given this information, the task-spatial dynamical pattern may be transformed into a corresponding body or shoulder spatial pattern. The resulting set of linear body-spatial equations of motion for the task's terminal device are defined in matrix form as follows (Note: In these and the following equations, a superscript T denotes the vector matrix transpose operation):

$$M_B \ddot{x} + B_B \dot{x} + K_B \Delta x = 0, \text{ where} \qquad (6)$$

$M_B = M_T R$, where $M_T$ = task space mass matrix; and $R$ = the rotation transformation matrix with elements $r_{ij}$, converting task space variables into body space form;

$$R = \begin{pmatrix} \cos\Phi & \sin\Phi \\ -\sin\Phi & \cos\Phi \end{pmatrix};$$

$B_B = B_T R$, where $B_T$ = task space damping matrix;

$K_B = K_T R$, where $K_T$ = task space stiffness matrix;

$\Delta x = x - x_0$, where $x = (x_1\ x_2)^T$, the current body space position vector of terminal device; and

$x_0 = (x_{01}\ x_{02})^T$, the body space position vector of the task space origin.

One should note that equation (6), unlike equations (4) and (5), represents a set of (usually) coupled, autonomous body spatial dynamic equations (i.e., the off-diagonal terms are generally non-zero) due to the rotation transformation. However, as in the case of the task-dynamic parameters, the terms of (6) are defined in force units and the resultant set of body spatial dynamic parameters is constant.
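The coupling introduced by the rotation can be seen numerically. The sketch below assumes the forms $M_B = M_T R$, $B_B = B_T R$, $K_B = K_T R$ with $R$ having rows $(\cos\Phi, \sin\Phi)$ and $(-\sin\Phi, \cos\Phi)$. The helper names and parameter values are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rotation(phi):
    """Rotation matrix R with rows (cos, sin) and (-sin, cos)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-s, c]]

def body_space_matrices(m_T, b_T, k_T, phi):
    """Return (M_B, B_B, K_B) = (M_T R, B_T R, K_T R) for diagonal task matrices."""
    R = rotation(phi)
    M_T = [[m_T, 0.0], [0.0, m_T]]
    B_T = [[b_T[0], 0.0], [0.0, b_T[1]]]
    K_T = [[k_T[0], 0.0], [0.0, k_T[1]]]
    return matmul(M_T, R), matmul(B_T, R), matmul(K_T, R)

# Hypothetical task parameters; reach axis rotated 30 degrees from body axis x1.
M_B, B_B, K_B = body_space_matrices(1.0, (10.0, 20.0), (25.0, 100.0),
                                    math.radians(30.0))
print(K_B)   # off-diagonal entries are non-zero: the body-space axes are coupled
```

With the orientation angle set to zero the same function returns diagonal matrices, which recovers the uncoupled task-space form.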

3. Joint variables: Task dynamic network. The above patterns of body spatial dynamic parameters were defined with reference to motions of an abstract terminal device disembodied from its effector system. These patterns may be further transformed into an equivalent expression based on the joint-variables of a massless "model" effector system. Like the transformation from task space to body space, this transformation is a strictly kinematic one and involves only the substitution of variables defined in one coordinate system for variables defined in another coordinate system. As illustrated in Figure 5B, this corresponds to expressing body spatial variables ($x$, $\dot{x}$, $\ddot{x}$) as functions of an arm model's kinematic variables ($\phi$, $\dot{\phi}$, $\ddot{\phi}$), where $\phi = (\phi_1\ \phi_2)^T$, $\phi_1$ = shoulder angle defined relative to axis $x_1$, and $\phi_2$ = elbow angle defined relative to the upper arm segment. It should be emphasized that the model arm used for this transformation is defined in kinematic terms only (i.e., the proximal and distal segments have lengths $l_1$ and $l_2$, respectively, but no masses), and that the arm's proximal (shoulder) and distal (wrist) ends are attached to the body space origin and the terminal device/task mass, respectively. The transformed equation is as follows (see Appendix A for details):

$$M_B J \ddot{\phi} + B_B J \dot{\phi} + K_B \Delta x(\phi) = -M_B V \dot{\phi}_P, \text{ where} \qquad (7)$$

$M_B$, $B_B$, $K_B$ = the same constant matrices used in equation (6);

$\Delta x(\phi) = x(\phi) - x_0$, where

$x(\phi) = (x_1(\phi), x_2(\phi))^T$, the current body space position vector of the terminal device expressed as a function of current joint angles;

$x_0$ = the same constant vector used in equation (6);

$J = J(\phi)$, the Jacobian transformation matrix whose elements $J_{ij}$ are partial derivatives $\partial x_i/\partial \phi_j$ evaluated at the current $\phi$;

$\dot{\phi}_P = (\dot{\phi}_1^2, \dot{\phi}_1\dot{\phi}_2, \dot{\phi}_2^2)^T$, the current joint velocity product vector; and

$V = V(\phi)$, a matrix of coefficients associated with $\dot{\phi}_P$, introduced during the kinematic transformation and evaluated at the current $\phi$.

One should note that the matrix products in equation (7) are not constant, but are nonlinearly dependent on the current arm model posture $\phi$ via the configuration dependence of the $J(\phi)$ and $V(\phi)$ matrices. Further, although equation (7) is expressed in terms of articulator or effector system variables, it is by no means an articulator-dynamic equation. Rather, it is simply the body-spatial dynamic equation (6) rewritten in the articulator-kinematic variables of a massless arm model with no reference to the actual mechanics of a performer's corresponding real arm. Its terms, in fact, are still defined in units of force, not torque. Thus, if the initial state of the arm model in equation (7) specifies an initial body-spatial wrist position and velocity equal to the initial position and velocity for the task-mass in equation (6), the arm model's joints will change (via equation (7)) in such a way that the wrist moves along exactly the same trajectory as would the abstract terminal device (via equation (6)).
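For a planar two-segment model arm, the kinematic quantities used in this transformation can be written out explicitly. The following is a sketch under stated assumptions rather than a transcription of Appendix A: the shoulder angle is measured from body axis $x_1$, the elbow angle from the upper arm segment, and the velocity product vector is ordered as $(\dot{\phi}_1^2, \dot{\phi}_1\dot{\phi}_2, \dot{\phi}_2^2)$ so that $\ddot{x} = J\ddot{\phi} + V\dot{\phi}_P$. Function names and arm dimensions are hypothetical.

```python
import math

def hand_position(phi, l1, l2):
    """Body-space wrist position x(phi) of a planar two-segment model arm."""
    shoulder, forearm = phi[0], phi[0] + phi[1]   # forearm angle relative to x1
    return (l1 * math.cos(shoulder) + l2 * math.cos(forearm),
            l1 * math.sin(shoulder) + l2 * math.sin(forearm))

def jacobian(phi, l1, l2):
    """J[i][j] = d x_i / d phi_j evaluated at the current posture."""
    shoulder, forearm = phi[0], phi[0] + phi[1]
    return [[-l1 * math.sin(shoulder) - l2 * math.sin(forearm), -l2 * math.sin(forearm)],
            [l1 * math.cos(shoulder) + l2 * math.cos(forearm), l2 * math.cos(forearm)]]

def velocity_product_matrix(phi, l1, l2):
    """V such that x_ddot = J phi_ddot + V (phi1_dot^2, phi1_dot*phi2_dot, phi2_dot^2)."""
    shoulder, forearm = phi[0], phi[0] + phi[1]
    c1, s1 = math.cos(shoulder), math.sin(shoulder)
    c12, s12 = math.cos(forearm), math.sin(forearm)
    return [[-(l1 * c1 + l2 * c12), -2.0 * l2 * c12, -l2 * c12],
            [-(l1 * s1 + l2 * s12), -2.0 * l2 * s12, -l2 * s12]]

# Hypothetical posture (radians) and segment lengths (metres).
posture, upper_arm, forearm_length = (0.4, 0.9), 0.30, 0.25
print(hand_position(posture, upper_arm, forearm_length))
print(jacobian(posture, upper_arm, forearm_length))
print(velocity_product_matrix(posture, upper_arm, forearm_length))
```

Both matrices depend only on posture and segment lengths, which is why the transformation is purely kinematic: no segment masses enter.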

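Equation (7) may be rewritten in units of angular acceleration:

$$\ddot{\phi} + J^{-1} M_B^{-1} B_B J \dot{\phi} + J^{-1} M_B^{-1} K_B \Delta x(\phi) = -J^{-1} V \dot{\phi}_P \qquad (8)$$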

For reasons to be elaborated further in the sections to follow, we consider equation (8) to define the task dynamic network (task network) for our reaching task example since, in effect, this equation describes a network of task- and context-specific dynamical relations among the arm model's articulator-kinematic variables. Ultimately, however, a reaching task is performed by a real arm whose motions and responses to perturbations are shaped according to task-specific, evolving patterns of articulator-dynamic parameters. In the task dynamic approach, constraints are supplied for these articulator dynamics with reference to the task network equation (8).

We will now review the basic articulator dynamics of a simple two-jointed arm, and then discuss two alternative ways in which equation (8) might be used to constrain these dynamics for a reaching task.

B. Articulator dynamics: Articulator network

For the purpose of simplicity, we will restrict our discussion to a two-joint, two-segment effector system whose segments ("upper arm" and "forearm") have lengths $l_1$ and $l_2$, with masses $m_1$ and $m_2$ uniformly distributed along the respective segment lengths. Assuming frictionless revolute joints ($\theta_1$, $\theta_2$; defined in the same manner as for the model arm) and no gravity, the passive mechanical (no controls) articulator dynamic equations of motion, whose terms are defined in units of torque, are (see Appendix B for details):

$$M_A \ddot{\theta} + S_A \dot{\theta}_P = 0, \text{ where} \qquad (9)$$

$M_A = M_A(\theta)$, the 2 × 2 acceleration sensitivity matrix associated with inertial torques, whose elements are functions of the current linkage configuration. The subscript "A" denotes articulator dynamic elements;

$S_A = S_A(\theta)$, a 2 × 3 matrix associated with coriolis torques (related to joint velocity cross products) and centripetal torques (related to squares of joint velocities), whose elements are functions of the current linkage configuration, $\theta$.

With controls included, this equation becomes:

$$M_A \ddot{\theta} + S_A \dot{\theta}_P + B_A \dot{\theta} + T_{As} + T_{Aa} = 0 \qquad (10a)$$

or

$$M_A \ddot{\theta} + S_A \dot{\theta}_P + B_A \dot{\theta} + K_A \Delta\theta + T_{Aa} = 0, \text{ where} \qquad (10b)$$

$B_A$ = a 2 × 2 control damping matrix;

$T_{As}$ = a 2 × 1 control spatial spring torque vector;

$K_A$ = a 2 × 2 control joint-stiffness matrix;

$\Delta\theta = \theta - \theta_0$, where $\theta_0$ = a 2 × 1 control reference configuration vector; and

$T_{Aa}$ = a 2 × 1 control additional torque vector, whose function will be described more fully in the following section on Control Laws.

Equation (10) may be rewritten as follows with terms defined in units of angular acceleration:

$$\ddot{\theta} + M_A^{-1} S_A \dot{\theta}_P + M_A^{-1} B_A \dot{\theta} + M_A^{-1} T_{As} + M_A^{-1} T_{Aa} = 0 \qquad (11a)$$

or

$$\ddot{\theta} + M_A^{-1} S_A \dot{\theta}_P + M_A^{-1} B_A \dot{\theta} + M_A^{-1} K_A \Delta\theta + M_A^{-1} T_{Aa} = 0 \qquad (11b)$$

Just as we considered equation 8 to define a network of task-dynamical relations over a kinematic arm model, we also consider equation 11 to define an articulator dynamic network (articulator network) of relations among our (simplified) real arm's joint variables.

The task dynamic problem for our reaching example (and other real world tasks as well) may now be posed as the question of how to specify patterns of articulator dynamic controls (Equations 10 and 11) such that the resultant terminal device's free and perturbed kinematics evolve according to constraints embodied in the corresponding task space's topological dynamics (Equation 5). We consider two related methods in the sections below based on alternate versions of equation (11). The first method uses equations 6 and 11 to formulate task-specific equations of constraint or control laws over the articulator dynamic parameters; the second combines the use of control laws with the concept of network coupling between the task (equation 8) and articulator (equation 11) networks. Both methods address the issue of coordination in artificial (robotic, prosthetic) linkage systems. The network coupling method also affords a novel perspective on styles of control in physiological systems. In the following section, the control law approach is described, while the network coupling method will be discussed in a later section on physiological modes of motor control.

C. Method 1: Control laws

This method is conceptually quite simple and is outlined in Figure 6. First, one assumes that the model arm state ($\phi$, $\dot{\phi}$) equals the real arm state ($\theta$, $\dot{\theta}$) and that $\theta$ and $\dot{\theta}$ (hence, also $\phi$ and $\dot{\phi}$) are specified proprioceptively. Second, one uses the following version of equation (11):

$$\ddot{\theta} + M_A^{-1} B_A \dot{\theta} + M_A^{-1} T_{As} + M_A^{-1}(S_A \dot{\theta}_P + T_{Aa}) = 0 \qquad (12)$$

Third, by comparing equations 8 and 12, one can see that the real arm variables will move according to task dynamic requirements (i.e., will move identically to the task network's model arm variables) when the following identities hold: a) $J^{-1} M_B^{-1} B_B J = M_A^{-1} B_A$; b) $J^{-1} M_B^{-1} K_B \Delta x(\theta) = M_A^{-1} T_{As}$; c) $J^{-1} V \dot{\theta}_P = M_A^{-1}(S_A \dot{\theta}_P + T_{Aa})$. Finally, one uses these identities to define the following nonlinear, state-dependent, articulator dynamic control laws:

$$B_A = M_A J^{-1} M_B^{-1} B_B J \qquad (13a)$$

$$T_{As} = M_A J^{-1} M_B^{-1} K_B \Delta x(\theta) \qquad (13b)$$

$$T_{Aa} = (M_A J^{-1} V - S_A)\,\dot{\theta}_P \qquad (13c)$$

It should be noted that the articulator dynamic controls in equations 13 are defined by the linkage configuration ($\theta$)- or state ($\theta$, $\dot{\theta}$)-dependent products of: a) $M_A$, $S_A$, $J$, $J^{-1}$, $V$, $x(\theta)$, and $\dot{\theta}_P$ (these are $\theta$- or $\dot{\theta}$-dependent, but task independent); and b) $M_B$, $B_B$, $K_B$, and $x_0$ (these are constant, but are dependent on both spatial context and task). Finally, one should note that for purposes of simplicity we have assumed that the computations involved in equations 13 occur instantaneously. However, in reality this cannot be the case and hence there must be a delay ($\Delta t$) between sensing a given linkage state (at $t = t_i$) and the specification of a task- and context-specific set of controls (at $t = t_i + \Delta t$). It is possible, therefore, that these controls will be totally inappropriate for the current ($t = t_i + \Delta t$) linkage state. There are two main ways to deal with this problem. The first is to minimize $\Delta t$ by using a variety of methods: a) table lookup (e.g., Raibert, 1978) for those terms in equations 13 that are independent of the current spatial and task contexts, but can be indexed according to current articulatory state; b) parallel computation procedures, such that all elements in all matrices in (13) are not computed sequentially; c) computation strategies that heuristically omit certain terms in (13) or that capitalize on the repeated use of certain "modular" functions (e.g., Benati, Gaglio, Morasso, Tagliasco, & Zaccaria, 1980) in the component terms in (13); and/or d) using remote sensing (exproprioception, e.g., vision) to specify certain kinds of information directly (e.g., hand position $x$) rather than indirectly through computations based on proprioceptive feedback (e.g., $x(\theta)$). The second way of reducing the adverse consequences of delays is to use a predictive, "lookahead" type of computation (e.g., Ito, 1982; Pellionisz & Llinas, 1979) such that given an estimate of delay $\Delta t$, the system might sense a linkage state at $t = t_i$, predict the state at $t = t_i + \Delta t$, and perform equation 13's computations with reference to this predicted state.
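The logic of the control laws can be checked numerically at a single instant. The sketch below is a consistency check rather than a simulation of a real arm. It assumes the task network in the form $\ddot{\phi} = -J^{-1}M_B^{-1}(B_B J\dot{\phi} + K_B\Delta x) - J^{-1}V\dot{\phi}_P$, the controlled arm in the form $M_A\ddot{\theta} + S_A\dot{\theta}_P + B_A\dot{\theta} + T_{As} + T_{Aa} = 0$, and controls given by $B_A = M_A J^{-1}M_B^{-1}B_B J$, $T_{As} = M_A J^{-1}M_B^{-1}K_B\Delta x$, $T_{Aa} = (M_A J^{-1}V - S_A)\dot{\theta}_P$. All matrix values are arbitrary, hypothetical numbers, and the function names are illustrative only.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, v):
    """Product of a matrix and a vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]

def inv2(a):
    """Inverse of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def task_network_accel(J, V, M_B, B_B, K_B, dx, vel, vel_p):
    """Joint acceleration demanded by the task network."""
    force = [d + s for d, s in zip(matvec(matmul(B_B, J), vel), matvec(K_B, dx))]
    linear = matvec(matmul(inv2(J), inv2(M_B)), force)
    product = matvec(matmul(inv2(J), V), vel_p)
    return [-(a + b) for a, b in zip(linear, product)]

def control_laws(M_A, S_A, J, V, M_B, B_B, K_B, dx, vel_p):
    """Articulator controls B_A, T_As, T_Aa computed from task-level parameters."""
    G = matmul(matmul(M_A, inv2(J)), inv2(M_B))            # M_A J^-1 M_B^-1
    B_A = matmul(matmul(G, B_B), J)
    T_As = matvec(matmul(G, K_B), dx)
    MJV = matmul(matmul(M_A, inv2(J)), V)                  # M_A J^-1 V
    diff = [[MJV[i][j] - S_A[i][j] for j in range(3)] for i in range(2)]
    return B_A, T_As, matvec(diff, vel_p)

def articulator_accel(M_A, S_A, B_A, T_As, T_Aa, vel, vel_p):
    """Joint acceleration of the controlled arm, solved from its torque balance."""
    torque = [a + b + c + d for a, b, c, d in
              zip(matvec(S_A, vel_p), matvec(B_A, vel), T_As, T_Aa)]
    return [-t for t in matvec(inv2(M_A), torque)]

# Hypothetical values at one instant, with model and real arm states equal.
J = [[-0.5, -0.2], [0.6, 0.4]]
V = [[-0.6, -0.4, -0.2], [-0.5, -0.3, -0.15]]
M_B = [[0.9, 0.4], [-0.4, 0.9]]
B_B = [[9.0, 4.0], [-8.0, 18.0]]
K_B = [[22.0, 10.0], [-40.0, 90.0]]
M_A = [[0.5, 0.1], [0.1, 0.2]]
S_A = [[0.0, -0.08, -0.04], [0.04, 0.0, 0.0]]
dx, vel = [0.2, -0.1], [0.7, -1.1]
vel_p = [vel[0] ** 2, vel[0] * vel[1], vel[1] ** 2]

B_A, T_As, T_Aa = control_laws(M_A, S_A, J, V, M_B, B_B, K_B, dx, vel_p)
demanded = task_network_accel(J, V, M_B, B_B, K_B, dx, vel, vel_p)
produced = articulator_accel(M_A, S_A, B_A, T_As, T_Aa, vel, vel_p)
print(demanded, produced)   # the two accelerations agree to rounding error
```

The agreement holds for any invertible $M_A$, $M_B$, and $J$, which is the sense in which the controls make the real arm move identically to the model arm; it says nothing about the effect of computational delay discussed above.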



Figure 6. Overview of information flow in control law version of task dynamics. (Panel labels: Control Laws; Articulator Network.)

V) Further Examples

In the preceding sections we described the details of the control law version of the task dynamic approach in the context of a discrete reaching task's point attractor topology. In the present section, we generalize this approach to other task types as well as to variations on the discrete reach task theme. More specifically we describe how the task dynamic model: a) generates task specific trajectory shapes in discrete reaching, rhythmic target-to-target, cup-to-mouth, and crank-turning tasks, and b) provides "immediate compensation" to a sustained perturbation introduced to an effector system while en route to a target in a reaching task.

In the current control law context, all examples and computer simulations described below represent motions of the articulator network (the "real" arm), and the task network (the "model" arm) is rigidly constrained to move identically due to the assumptions that $\phi = \theta$ and $\dot{\phi} = \dot{\theta}$, given the current "proprioceptively" specified $\theta$ and $\dot{\theta}$.

A. Trajectory shaping

1. Discrete reaching. This is the familiar reaching example, whose task space is defined as a two-dimensional point attractor (see Figure 4A). A straight-line trajectory for the terminal device (the hand) generated by these task dynamics for a discrete reach is illustrated in Figure 7 (trajectory a). For this trajectory the task space axis stiffnesses are symmetrical (i.e., $k_{T1} = k_{T2}$) and critical damping is assumed along both axes. Note, however, that perfect straight line trajectories are generated in contrast to the quasi-straight line trajectories observed experimentally for primates (e.g., Georgopoulos, Kalaska, & Massey, 1981; Morasso, 1981; Soechting & Lacquaniti, 1981).



Figure 7. Body space discrete reaching trajectories showing effects of omitting velocity product torque compensation terms with different task axis weightings. I and F denote initial and final arm configurations, respectively.

As alluded to earlier (see also Footnote 8), it is possible to omit $T_{Aa}$ (i.e., the control vector associated with velocity product torques) from equation 12 and thereby obtain more "realistic" trajectories while at the same time reducing the amount of computation involved in specifying constraints on articulator dynamic parameters (trajectory b in Figure 7). As with trajectory a, trajectory b illustrates a reach involving symmetrical task axis stiffnesses and critical task space damping. Omitting $T_{Aa}$ results in an articulator network whose velocity product terms are simply those specified by passive arm mechanics (i.e., $S_A \dot{\theta}_P$ in equation 9) rather than those specified by the task network (i.e., $M_B V \dot{\phi}_P$ in equation 7). Note that although the omission of $T_{Aa}$ introduces a "hook" into trajectory b's illustrated hand motion, the hand nevertheless arrives precisely on target due to the underlying task space point attractor dynamics. This preservation of accurate targeting behavior when control terms related to velocity product torques are ignored is a feature of the task dynamic approach not shared by some other robotic control schemes (e.g., Hollerbach & Flash, 1981, see their Figure 8). Finally, it should be noted that straight line hand trajectories can be approximated when $T_{Aa}$ is omitted by a judicious relative weighting of task axis stiffnesses. Hand trajectories progressively closer to ideal straight lines will be produced using progressively greater penalties for task mass deviations from the task space reach axis ($t_1$) en route to the target. A hand trajectory for the arm motion corresponding to one such ratio ($k_{T2}:k_{T1} = 1.75:1$) with critical damping along both task axes is illustrated in Figure 7 (trajectory c).

2. Cup-to-mouth task. In a cup-to-mouth task the goal is to move a cup of liquid from an initial to final position (e.g., table top to mouth) while maintaining a horizontal spillage-preventing cup orientation during the movement. As in our discussion of the discrete reaching task, we begin with a simplified task-dynamic treatment of a planar cup-to-mouth task performed by a 3-joint (shoulder, elbow, wrist) arm using an abstract, functional description of that skill's task space. This task space is modeled as a three dimensional (one rotational and two linear degrees of freedom) point attractor and is illustrated in Figure 8A. In this figure the terminal device is an abstract task-segment ($m_T$ = mass, $l_T$ = length) representing the grasped cup, with one end (the "distal" end) defined as the point of final cup-mouth contact, and requiring three coordinates for its complete task space description. The target location (mouth) for the segment's distal end defines the origin ($t_{01}$, $t_{02}$) of a $t_1 t_2$ Cartesian coordinate system; axis $t_1$ is defined as a reach-axis from the initial position of the segment's distal end to the $t_1 t_2$ origin; and axis $t_2$ is defined orthogonally to $t_1$. The orientation of the task segment relative to axis $t_1$ defines the current angular $t_3$ coordinate; $t_{03}$ defines the (identical) initial and target task segment orientations, and $I_T$ ($= [1/3]\,m_T l_T^2$) is the task segment's moment of inertia about its distal end. The equations of motion corresponding to axes $t_1$, $t_2$, and $t_3$ are:

$$m_T \ddot{t}_1 + b_{T1} \dot{t}_1 + k_{T1} t_1 = 0 \qquad (14a)$$

$$m_T \ddot{t}_2 + b_{T2} \dot{t}_2 + k_{T2} t_2 = 0 \qquad (14b)$$

$$\rho^{-1}\,[\,I_T \ddot{t}_3 + b_{T3} \dot{t}_3 + k_{T3}(t_3 - t_{03})\,] = 0 \qquad (14c)$$

where $\rho$ is a constant scaling factor with units of length and is used to ensure dimensional homogeneity along all task space degrees of freedom. Thus, all terms of equation 14, even the rotational terms of 14c, are defined in units of force. For purposes of the present paper, $\rho$ is set to 1 and consequently is omitted for notational simplicity in all further discussions in this section. In Figure 8A the stiffness and damping elements are represented in lumped form as squiggles in the lines connecting the task segment to the linear "rest positions" and to the rotational "rest orientation." Equations 14 describe a set of uncoupled (by definition of the abstract task space) equations with constant task dynamic parameters and can be represented in matrix form as:

$$M_T \ddot{t} + B_T \dot{t} + K_T \Delta t = 0, \text{ where} \qquad (15)$$

$\Delta t = (t_1,\ t_2,\ t_3 - t_{03})^T$, and $M_T$, $B_T$, and $K_T$ are 3 × 3 diagonal matrices of task dynamic parameters analogous to the simpler 2 × 2 point attractor system of equation 5. In a similar fashion, the body spatial equation and the joint variable (task network) equation are simply the 3 × 3 analogs of equations 6 and 7. The corresponding body spatial and joint variable representations are illustrated in Figures 8B and 8C.

When simulated, a typical movement generated by these task dynamics, using symmetrical task axis stiffnesses ($k_{T1} = k_{T2} = k_{T3}$) and critical damping along all task axes, shows both a straight line trajectory and a maintained horizontal orientation of the task segment during the movement.

3. Reaching (rhythmic). The point attractor task space topologies used for the discrete reaching and cup-to-mouth tasks will be unable to generate the arm kinematics associated with sustained cyclic hand motion between two body spatial targets. Consider, for example, the case of planar motion of the terminal device (hand) and a corresponding 2-joint effector system (arm with shoulder and elbow joints). The task space is illustrated in Figure 9A and consists of an orthogonal pair of axes ($t_1$, $t_2$) for which: a) $t_1$ is defined along the line between the two targets ($D$ = distance between the targets); and b) the origin is located midway between the two targets ($A = D/2$ = distance from origin to either target). The terminal device is an abstract point task-mass ($m_T$ = mass), and may be located anywhere in the task space. Point attractor dynamics are defined along axis $t_2$ to bring the task mass onto axis $t_1$ and to maintain it there despite transient perturbations introduced perpendicular to $t_1$. Limit cycle (periodic attractor) dynamics are defined along axis $t_1$ to sustain a cyclic motion of the task mass parallel to $t_1$ between the two targets, and to maintain the desired oscillation amplitude ($A = D/2$) despite perturbations introduced parallel to $t_1$. The task space equations of motion are:

$$m_T \ddot{t}_1 - b_{T1} \dot{t}_1 + c_{T1} t_1^2 \dot{t}_1 + k_{T1} t_1 = 0 \qquad (16a)$$

$$m_T \ddot{t}_2 + b_{T2} \dot{t}_2 + k_{T2} t_2 = 0, \text{ where} \qquad (16b)$$

$m_T$, $k_{T1}$, $b_{T2}$, and $k_{T2}$ are defined as in equation 5 (discrete reaching task, point attractor); and ($-b_{T1}\dot{t}_1 + c_{T1} t_1^2 \dot{t}_1$) is the nonlinear escapement term (van der Pol type) for axis $t_1$.

The dynamic parameters for axis tg are tuned In the same manner as in 
the tg axis of^ the discrete reach task space (see earlier Task space sec- 
tion). Tuning the dynamic parameters along axis t^ involves specifying 
k^^ according to the desired period » P» of motion and the relation 
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Figure 9, Rhythmic reaching: A, Task space. Open circles represent targets, 
Squiggle represents point attractor (spring and damper) dynamics 



along axis t^^ Open box represents limit cycle (spring and van 

J. ..J - ^1 ^j^j^g , 



der Pol escapement ) 
C, Task network. 



dynamics along 



B, Body space; 
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/ 



superscripts denote differentiati on with 
ne variable, N=ai.n, with tOr.^ Jk-../m*' 
e -arlable; bX Jc^ b^^ t^ is the 
and c) 6 ^ b^-/|mj,Vji is a dlmen- 



P=2W/ Jk^^Tm^, The procedure for specifying b^^ and c^-j is more In- 
volved, and may be understood by considering equation 16d In the following 
normalized, dlmenslonless form: 

$$Z_1'' - \epsilon (1 - Z_1^2) Z_1' + Z_1 = 0, \text{ where} \tag{17}$$

a) the single and double apostrophe superscripts denote differentiation with
respect to the dimensionless time variable, $N = \omega_0 n$, with
$\omega_0 = \sqrt{k_{T1}/m_T}$, and $n$ denotes the standard time variable;
b) $Z_1 = \sqrt{c_{T1}/b_{T1}}\, t_1$ is the dimensionless displacement variable; and
c) $\epsilon = b_{T1}/\sqrt{m_T k_{T1}}$ is a dimensionless measure directly related both to
escapement "strength" (i.e., the strength with which the system resists being
displaced from the limit cycle) and the shape of the limit cycle orbit in the
phase plane (e.g., $\epsilon \ll 1$ corresponds to a circular orbit and sinusoidal
motion; $\epsilon \gg 1$ corresponds to a relaxation orbit and a step-like motion).
Given values for $k_{T1}$ (from the desired period) and $\epsilon$ (from the desired
orbit shape), $b_{T1}$ is determined from the above expression for $\epsilon$. Finally,
it is known that the amplitude for the normalized ($Z$-variable) system of
equation 17 is equal to 2.0 over a wide range of $\epsilon$ values (e.g.,
$0 < \epsilon < 10$; Jordan & Smith, 1977). For the original non-normalized
($t$-variable) system of equation 16a, the corresponding amplitude is
$A = 2\sqrt{b_{T1}/c_{T1}}$. Therefore, given values for $b_{T1}$ and desired amplitude
$A$, the value of $c_{T1}$ is determined from the preceding expression for $A$.
Finally, equation 16 may be rewritten in matrix form as:

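$$M_T \ddot{t} + B_T \dot{t} + K_T t = F_T, \text{ where} \tag{18}$$

$M_T$ and $K_T$ are defined as in equation 5;

$B_T = \begin{bmatrix} -b_{T1} & 0 \\ 0 & b_{T2} \end{bmatrix}$, denoting the linear damping components; and

$F_T = (-c_{T1} t_1^2 \dot{t}_1,\ 0)^T$, denoting nonlinear system components.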
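
The tuning procedure for axis $t_1$ can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it computes $k_{T1}$, $b_{T1}$, and $c_{T1}$ from the relations for $P$, $\epsilon$, and $A$ given above, and then integrates equation 16a to check that the orbit settles near the desired amplitude. The function names, the Runge-Kutta integrator, and all numerical values (unit mass, 1 s period, 0.1 m amplitude, $\epsilon = 0.5$) are assumptions chosen only for illustration.

```python
import math

def tune_limit_cycle(m_T, period, epsilon, amplitude):
    """Return (k_T1, b_T1, c_T1) for the van der Pol axis t1 of equation 16a."""
    k_T1 = m_T * (2.0 * math.pi / period) ** 2   # from P = 2*pi / sqrt(k_T1 / m_T)
    b_T1 = epsilon * math.sqrt(m_T * k_T1)       # from epsilon = b_T1 / sqrt(m_T * k_T1)
    c_T1 = 4.0 * b_T1 / amplitude ** 2           # from A = 2 * sqrt(b_T1 / c_T1)
    return k_T1, b_T1, c_T1

def simulate_t1(m_T, k_T1, b_T1, c_T1, t1, v1, dt, n_steps):
    """Integrate equation 16a with classical RK4; return the t1 samples."""
    def accel(pos, vel):
        return (b_T1 * vel - c_T1 * pos * pos * vel - k_T1 * pos) / m_T

    samples = [t1]
    for _ in range(n_steps):
        k1p, k1v = v1, accel(t1, v1)
        k2p, k2v = v1 + 0.5 * dt * k1v, accel(t1 + 0.5 * dt * k1p, v1 + 0.5 * dt * k1v)
        k3p, k3v = v1 + 0.5 * dt * k2v, accel(t1 + 0.5 * dt * k2p, v1 + 0.5 * dt * k2v)
        k4p, k4v = v1 + dt * k3v, accel(t1 + dt * k3p, v1 + dt * k3v)
        t1 += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        v1 += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        samples.append(t1)
    return samples

# Hypothetical task demands: 1 s period, 0.1 m amplitude, near-sinusoidal orbit.
k_T1, b_T1, c_T1 = tune_limit_cycle(m_T=1.0, period=1.0, epsilon=0.5, amplitude=0.1)
trajectory = simulate_t1(1.0, k_T1, b_T1, c_T1, t1=0.01, v1=0.0, dt=0.001, n_steps=20000)

# Peak excursion over the last two seconds, after the startup transient.
steady_state_amplitude = max(abs(x) for x in trajectory[-2000:])
```

Starting well inside the limit cycle, the simulated motion grows until its peak excursion is close to the requested amplitude, which is the autonomous attraction to the steady-state orbit that the text describes.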

Equations 16 and 18 represent an autonomous, uncoupled, task spatial
dynamical system with constant parameters. Figure 9B illustrates how the task
space is located and oriented in body (shoulder) space. The body spatial
dynamical system is described by:

$$M_B \ddot{x} + B_B \dot{x} + K_B \Delta x = F_B, \text{ where} \tag{19}$$

$M_B = M_T R$, where $R$ = the rotational transform matrix with elements
$r_{ij}$ defined previously in equation 6;

$B_B = B_T R$;

$K_B = K_T R$; and

$F_B = (-c_{T1} (r_{11} \Delta x_1 + r_{12} \Delta x_2)^2 (r_{11} \dot{x}_1 + r_{12} \dot{x}_2),\ 0)^T$.

Equation 19 describes an autonomous, coupled (due to the rotation
transformation), body spatial dynamical system with a constant set of linear
parameters and a nonlinear, state-dependent forcing function. This body
spatial equation may be transformed kinematically into joint variable form by
expressing $x$ variables as functions of the $\phi$ variables of a corresponding
model arm (see Figure 9C):

$$M_B J \ddot{\phi} + B_B J \dot{\phi} + K_B \Delta x(\phi) = F_B - M_B V \dot{\phi}_p, \text{ where} \tag{20}$$

$M_B$, $B_B$, and $K_B$ are defined as in equation 19;

$J = J(\phi)$, the Jacobian matrix;

$F_B = F_B(\phi, \dot{\phi})$, using the substitutions $\Delta x_1 = \Delta x_1(\phi)$,
$\Delta x_2 = \Delta x_2(\phi)$, $\dot{x}_1 = J_{11} \dot{\phi}_1 + J_{12} \dot{\phi}_2$, and
$\dot{x}_2 = J_{21} \dot{\phi}_1 + J_{22} \dot{\phi}_2$; and

$V$ and $\dot{\phi}_p$ are as defined in equation 7.

Equation 20 may be rewritten in the following task network form:

$$\ddot{\phi} + J^{-1} M_B^{-1} B_B J \dot{\phi} + J^{-1} M_B^{-1} K_B \Delta x(\phi) + J^{-1} V \dot{\phi}_p = J^{-1} M_B^{-1} F_B \tag{21}$$

Finally, since the real arm's action can be described by the articulator
network ($\theta$ variables) equation 12, one sees that the articulator controls,
$B_A$ and $K_A$, are specified according to control laws 13a and 13b,
respectively. A comparison of equations 12 and 21 shows that, assuming
$\phi = \theta$ and $\dot{\phi} = \dot{\theta}$, the articulator control is defined according to
the following control law:

A typical movement generated by these task-dynamics is illustrated in
Figure 10, showing the motion of the task mass in body (shoulder) space. Note
the straight line hand trajectory during the steady-state cyclic motion
between targets, and also the way the hand is attracted autonomously to this
steady-state trajectory despite a startup position (with zero velocity) away
from this trajectory.

Crank Turning. Figure 11C illustrates the shoulder spatial layout of
a crank turning task in which: a) motion of arm and crank occurs in the
horizontal plane; b) the crank segment's "distal" end is attached to a fixed
rotation axis located in shoulder space; c) the crank rotates at a
constant angular velocity, $V$, about the fixed axis; d) the wrist joint is
fixed and the hand tightly grasps the crank's handle, which freely rotates
about an axis fixed to the crank's "proximal" end; and e) $\phi_1$ and $\phi_2$
represent the shoulder and elbow angles, respectively, while $\phi_3$ represents
the angle between the hand-forearm and crank. The task space description is
illustrated in Figure 11A in which: a) the crank is the terminal device or
task segment ($m_T$ = mass, $l_T$ = length); b) the fixed rotation axis at the
crank's distal end defines the origin of a Cartesian $t_1 t_2$ coordinate
system; c) an angular $t_3$ coordinate is defined by the orientation of the
crank relative to axis $t_1$; and d) $I_T = (1/3) m_T l_T^2$ is the crank's moment
of inertia about its distal end. The task spatial equations of motion are
defined as:

$$m_T \ddot{t}_1 + b_{T1} \dot{t}_1 + k_{T1} t_1 = 0 \tag{23a}$$

$$m_T \ddot{t}_2 + b_{T2} \dot{t}_2 + k_{T2} t_2 = 0 \tag{23b}$$

$$\rho I_T \ddot{t}_3 - \rho b_{T3} \dot{t}_3 + \rho c_{T3} \dot{t}_3^3 = 0, \text{ where} \tag{23c}$$

$\rho$ is the same scaling factor used in equation 14c, and will be omitted from
further discussions in this section for notational simplicity.

Equations 23a and 23b define point attractors whose corresponding damping
and stiffness factors are represented in lumped form in Figure 11A, and which
serve to maintain the crank's distal end at the task space origin. Since in
the real world the crank is fixed to this axis, these axes may be weighted
rather loosely (i.e., they may be assigned low values for $k_{T1}$ and $k_{T2}$).
Equation 23c needs a bit more explanation as it contains a limit cycle's
escapement term (Rayleigh type escapement: $-b_{T3} \dot{t}_3 + c_{T3} \dot{t}_3^3$) but no
spring term. The behavior associated with equation 23c is best understood by
examination of its corresponding phase portrait (Figure 12). Here it can be
seen that there are three steady states represented by lines parallel to the
$t_3$ axis. The lines defined by $\dot{t}_3 = \pm V = \pm\sqrt{b_{T3}/c_{T3}}$ are stable steady
states, and the line $\dot{t}_3 = 0$ denotes an unstable steady state. In other
words, given any nonzero startup velocity in either the upper or lower half
plane, the system will reach the corresponding positive or negative steady
state angular velocity, $\pm V$. If, however, the system begins at any angular
position with zero velocity, it will remain at rest on the unstable steady
state $\dot{t}_3 = 0$.

Figure 11. Crank turning: A. Task space. Squiggles represent point attractor
dynamics along linear axes $t_1$ and $t_2$. Open box represents
velocity attractor (Rayleigh escapement) dynamics along rotational
axis $t_3$; B. Body space; C. Task network.

Figure 12. Phase portrait of velocity attractor system.



In normalized form, equation 23c becomes:

$$\dot{Z}_2 - \epsilon (1 - Z_2^2) Z_2 = 0, \text{ where} \tag{24}$$

$Z_2 = \sqrt{c_{T3}/b_{T3}}\, \dot{t}_3$ is the dimensionless velocity variable, and
$\epsilon = b_{T3}/I_T$ is directly related to the "strength" of the escapement (i.e.,
the speed with which the system attains the steady state and the strength with
which it resists perturbations from the steady state). Since we are unaware
of any other label for this type of dynamical topology, we will call it a
bistable velocity attractor (or more simply, a velocity attractor). Given a
desired escapement strength ($\epsilon$) and final crank angular velocity ($V$), the
above relationships are sufficient to tune the system's $b_{T3}$ and $c_{T3}$ values
according to these task demands. Equations 23 may be rewritten in matrix form
as:

$$M_T \ddot{t} + B_T \dot{t} + K_T t = F_T, \text{ where} \tag{25}$$

$M_T$ is defined as in equation 15.

Equations 23 and 25 represent an autonomous, uncoupled (by definition)
task spatial dynamical system with constant parameters. Figure 11B shows how
the task space is located and oriented in body (shoulder) spatial coordinates.
Note that the orientation of task-to-body space is arbitrary, and in Figure
11B the orientation angle $\Phi$ is simply assumed to be zero (e.g., see Figure 8B
for an example of a different task with nonzero $\Phi$). Figure 11C, as mentioned
previously, shows the relation of the task and body spaces to the task's
"model" arm. Equations for body spatial, arm model, and task network dynamics
may be derived from equation 25 in a manner similar to that used in generating
equations 19, 20, and 21, respectively, from equation 18.
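
The tuning of the velocity attractor described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' procedure: it takes the relations $\epsilon = b_{T3}/I_T$ and $V = \sqrt{b_{T3}/c_{T3}}$ from the text, omits the scaling factor $\rho$ as the text does, and integrates the angular velocity of equation 23c with a simple Euler scheme. The crank mass and length, the values of $\epsilon$ and $V$, and the function names are hypothetical.

```python
import math

def tune_velocity_attractor(I_T, epsilon, V):
    """Return (b_T3, c_T3) from epsilon = b_T3 / I_T and V = sqrt(b_T3 / c_T3)."""
    b_T3 = epsilon * I_T
    c_T3 = b_T3 / V ** 2
    return b_T3, c_T3

def final_velocity(I_T, b_T3, c_T3, v0, dt=0.001, duration=10.0):
    """Euler-integrate I_T * dv/dt = b_T3 * v - c_T3 * v**3 and return v at the end."""
    v = v0
    for _ in range(int(round(duration / dt))):
        v += dt * (b_T3 * v - c_T3 * v ** 3) / I_T
    return v

# Hypothetical crank: m_T = 0.6 kg, l_T = 0.2 m, so I_T = (1/3) m_T l_T^2.
I_T = (1.0 / 3.0) * 0.6 * 0.2 ** 2
b_T3, c_T3 = tune_velocity_attractor(I_T, epsilon=5.0, V=2.0 * math.pi)

v_forward = final_velocity(I_T, b_T3, c_T3, v0=0.5)    # starts in the upper half plane
v_backward = final_velocity(I_T, b_T3, c_T3, v0=-0.5)  # starts in the lower half plane
v_rest = final_velocity(I_T, b_T3, c_T3, v0=0.0)       # starts on the unstable state
```

Any nonzero startup velocity ends at $+V$ or $-V$ according to its sign, while a start at exactly zero velocity stays at zero, matching the three steady states of the phase portrait.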

It should be noted that the configuration of the model arm is specified
in exactly the same manner as in our earlier examples. Angles $\phi_1$ (shoulder)
and $\phi_2$ (elbow) can be obtained "proprioceptively," but the angle $\phi_3$ between
the crank and the hand-forearm cannot. However, assuming that the location of
the crank's distal end (environmentally fixed rotation axis) is known in body
space coordinates and given $\phi_1$ and $\phi_2$ proprioceptively, $\phi_3$ is uniquely
specified by geometric considerations. Thus the full $\phi$ set is available for
use in the control law computations.
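
The geometric step can be made concrete with a short sketch. The angle conventions below are assumptions, since the text does not state them: the shoulder is taken as the origin, $\phi_1$ is measured from the $x_1$ axis, $\phi_2$ is measured relative to the upper arm, and $\phi_3$ is the signed angle from the hand-forearm direction to the direction pointing from the handle to the fixed rotation axis. The function name and the test geometry are hypothetical.

```python
import math

def crank_angle(phi1, phi2, l_upper, l_forearm, crank_axis, l_crank, tol=1e-6):
    """Return phi3, the signed angle from the hand-forearm to the crank.

    Assumed conventions (not given in the text): the shoulder is the origin,
    phi1 is measured from the x1 axis, phi2 is the elbow angle relative to the
    upper arm, and the crank direction points from the handle to the fixed axis.
    """
    forearm_dir = phi1 + phi2
    hand_x = l_upper * math.cos(phi1) + l_forearm * math.cos(forearm_dir)
    hand_y = l_upper * math.sin(phi1) + l_forearm * math.sin(forearm_dir)
    dx, dy = crank_axis[0] - hand_x, crank_axis[1] - hand_y
    # The hand grasps the handle, so it must lie one crank length from the axis.
    if abs(math.hypot(dx, dy) - l_crank) > tol:
        raise ValueError("hand is not on the circle swept by the crank handle")
    phi3 = math.atan2(dy, dx) - forearm_dir
    return math.atan2(math.sin(phi3), math.cos(phi3))  # wrap into [-pi, pi]

# Unit-length segments, elbow bent 90 degrees, axis one crank length from the hand.
phi3 = crank_angle(0.0, math.pi / 2.0, 1.0, 1.0, crank_axis=(2.0, 1.0), l_crank=1.0)
```

In this example the forearm points along the second body axis and the crank along the first, so the sketch returns a quarter turn with negative sign.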

B. Immediate Compensation

In the Introduction, we reviewed experimental data on speech movements
that showed relatively immediate, task-specific, automatic, compensatory
response patterns in remote articulators to unpredicted transient
perturbations in a given articulator. These data implied that selective
patterns of coupling or gating existed among the component articulators that
were specific to the produced utterances. In the context of the task dynamic
approach, we hypothesize that these coupling patterns are due to the
corresponding evolving patterns of articulator-dynamic control parameters
specified by task- and state-dependent control laws or equations of
constraint.

To illustrate, consider the following example of a discrete reaching task
(formulated as a modified version of a cup-to-mouth task) in which: a) the
terminal device is a pointer fixed to the hand of a 3-segment (upper arm,
forearm, hand-pointer) arm; b) planar motion of the pointer corresponds to
angular motions of the arm's 3 joints ($\phi_1$ = shoulder, $\phi_2$ = elbow,
$\phi_3$ = angle between pointer and forearm); and c) task demands focus on
positioning the pointer's distal end at a body-spatial $x_1 x_2$ target but are
relatively indifferent to the precision of final orientation control.
Consequently, the task space may be described as a 3-dimensional point
attractor with symmetrical weightings for the linear $t_1$ and $t_2$ axes, and a
much smaller weighting for the rotational $t_3$ axis. Figure 13 illustrates the
initial (a) and final (b) arm configurations that correspond to the current
task dynamics (weighting ratio of axes $t_1$ and $t_2$ to $t_3$ is 20:1) when the
arm encounters no perturbations en route to its body spatial pointer target.
The initial arm configuration is $\phi_i$ = (79°, 20°, 71°)$^T$ and the final arm
configuration is $\phi_f$ = (115°, 81°, 7y°)$^T$. Figure 13 (configuration c) shows
the final arm position when the shoulder angle is suddenly braked during the
trajectory when it reaches 105° and is held fixed at this angle. The initial
$\phi_i$ is the same as in the unperturbed case and the pointer's distal end
reaches precisely the same spatial $x_1 x_2$ target as in the unperturbed
motion, despite the fact that the final configuration has changed to
$\phi_f$ = (105°, 95°, 52°)$^T$.
In other words, the system's response to the perturbation was to
"automatically" redistribute the activity among its component degrees of
freedom in a manner that still achieved the same task spatial goal.
Furthermore, such compensatory motor equivalence reflects the fact that
targets in the task dynamic approach are not specified as final articulator
configurations, but rather as desired final spatial coordinates for the
terminal device. The final articulator configuration "falls out" of the task
dynamic organization and the environmental conditions in which the movement
is performed.

Figure 13. Arm configurations for simulated discrete reaches showing:
a. Initial posture; b. Final posture (unperturbed trajectory);
c. Final posture (perturbed trajectory).

Figure 14. Description of basic postural perturbation paradigm of Nashner and
colleagues, showing four types of perturbation (right column) and
corresponding leg joint angular rotations (left column): A. AP
translation; B. Direct rotation; C. Synchronous vertical;
D. Reciprocal vertical. (Adapted from Nashner & Woolacott, 1979.)
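
The motor equivalence seen in the perturbed reach can be illustrated with a deliberately simplified sketch. It is not the task-dynamic simulation reported above: it uses a first-order (resolved-rate) scheme in which the pointer tip is attracted to the spatial target and joint increments are obtained through the Jacobian pseudoinverse, it ignores the rotational $t_3$ axis, and it brakes the shoulder from movement onset rather than partway through the trajectory. Segment lengths, starting angles, the target, and all function names are hypothetical.

```python
import math

LENGTHS = (0.30, 0.30, 0.15)  # hypothetical upper arm, forearm, hand-pointer lengths (m)

def pointer_tip(phi):
    """Planar forward kinematics; phi holds relative joint angles in radians."""
    x = y = angle = 0.0
    for length, joint in zip(LENGTHS, phi):
        angle += joint
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y

def jacobian(phi):
    """2 x 3 Jacobian of the pointer tip with respect to the joint angles."""
    absolute = [sum(phi[:i + 1]) for i in range(3)]
    row_x = [-sum(LENGTHS[k] * math.sin(absolute[k]) for k in range(i, 3)) for i in range(3)]
    row_y = [sum(LENGTHS[k] * math.cos(absolute[k]) for k in range(i, 3)) for i in range(3)]
    return row_x, row_y

def reach(phi_start, target, braked=(), gain=0.1, n_steps=400):
    """Drive the tip to the target; joints listed in `braked` are held fixed."""
    phi = list(phi_start)
    for _ in range(n_steps):
        x, y = pointer_tip(phi)
        ex, ey = gain * (target[0] - x), gain * (target[1] - y)
        row_x, row_y = jacobian(phi)
        for j in braked:                      # a braked joint contributes no motion
            row_x[j] = row_y[j] = 0.0
        # Pseudoinverse step: d_phi = J^T (J J^T)^-1 e, with a 2 x 2 inverse.
        a = sum(v * v for v in row_x)
        b = sum(u * v for u, v in zip(row_x, row_y))
        d = sum(v * v for v in row_y)
        det = a * d - b * b
        lam_x = (d * ex - b * ey) / det
        lam_y = (a * ey - b * ex) / det
        for j in range(3):
            phi[j] += row_x[j] * lam_x + row_y[j] * lam_y
    return phi

start = [math.radians(a) for a in (60.0, 60.0, 30.0)]
target = (-0.05, 0.45)
free_final = reach(start, target)                 # unperturbed reach
braked_final = reach(start, target, braked=(0,))  # shoulder held at its start angle
```

Both runs bring the pointer tip to the same spatial target, but with different final joint configurations: the target is specified only in spatial coordinates, and the remaining joints absorb the motion that the braked shoulder cannot contribute.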

VI. Relevance to Physiological Literature

In this section, we will describe how the task dynamic approach might
apply to the issue of postural control in humans, and suggest a "systemic"
alternative to the modular synergy model of Nashner (e.g., Nashner, 1981;
Nashner & Woolacott, 1979) to account for postural compensatory phenomena.
Further, we will review evidence from studies of single-joint discrete
movement tasks (e.g., Bizzi, Chapple, & Hogan, 1982) that show that the
physiologically relevant parameter of rest angle (i.e., the angle specified by
the equilibrium point between agonist and antagonist length-tension curves) is
actively specified during such tasks as a gradually (as opposed to step-like)
changing central control signal. In the control law version of task dynamics,
there is no articulator-dynamic control parameter corresponding to rest angle.
However, a network coupling version of task dynamics, now in preliminary form,
will be described that includes rest angle as a parameter and provides a
rational account for the evolution of the rest angle's trajectory without
requiring an explicitly preplanned trajectory representation to account for
the observed pattern. Finally, we will discuss the implications of the network
coupling approach for theories of learned complex skilled actions.

A. Postural control

Nashner and his colleagues have performed an elegant series of
experiments on postural responses to support surface perturbations in standing
human subjects. Summarizing from the experimental report of Nashner,
Woolacott, and Tuma (1979) and several subsequent reviews (Nashner, 1979;
Nashner, 1981; Nashner & Woolacott, 1979), we can describe the paradigm and
findings in the following way. Basically, a subject stands with each foot on a
separate horizontal platform that can be translated horizontally, translated
vertically, or rotated about an axis aligned with the ankle joint. Using these
platforms, one or a combination of the following four types of perturbation
could be delivered to the subjects on a given trial (Figure 14): a)
simultaneous forward or backward anteroposterior translation (AP translation);
b) simultaneous flexion or extension rotations (direct rotation); c)
simultaneous upward or downward vertical translation (synchronous vertical);
and d) reciprocal vertical translation (reciprocal vertical). These
perturbation types may be characterized by the corresponding patterns of whole
body motions and joint rotations that would be induced in "passive"
noncompensating subjects (Figure 14). Thus, AP translation caused the body to
lean in the direction opposite to the translation; direct rotation caused the
body to tilt in the same sense as the rotation; synchronous vertical caused
the body to move with the translation; and reciprocal vertical caused the body
to tilt laterally toward the lowering platform. It should be noted that the
first three perturbation types induce motions in the sagittal plane, while the
reciprocal vertical type induces motion in the frontal plane.

In response to each perturbation type or type combination, Nashner et
al. measured EMG responses from the upper and lower leg muscles, as well as
changes in ankle, knee, and hip angles. Associated with each perturbation
type was a long latency (e.g., 100 ms latency in gastrocnemius) "rapid
postural adjustment" (Nashner, 1981), which comprised the earliest useful
postural response, while the shorter latency myotatic reflexes were either
absent or of no apparent functional value. These rapid postural adjustments
for a given type a) were characterized by fixed ratios of activity among the
responding muscles, b) were specific to the perturbation type (and the
corresponding type-specific patterns of joint displacements), and c) were
"functionally related to the task of coordinating one kind of postural
adjustment" (Nashner, 197 , p. 179). Further, during a set of trials in which a
sequence of any one of three perturbation types (AP translation, synchronous
vertical, or reciprocal vertical) was unexpectedly and immediately followed by
a sequence of one of the other two types, it was found that the functionally
appropriate postural synergy response occurred even on the first trial of the
new type. Such "first trial adaptation" did not occur, however, when a
sequence of AP translations was followed immediately by an unexpected series
of direct rotations (or vice versa). In these cases, the functionally correct
(i.e., posturally stabilizing) synergistic response pattern was implemented
progressively over a series of approximately three to five trials.
Additionally (Horak & Nashner, 1983), if a series of AP trials with the
subject standing directly on the footplates was followed by a series of AP
trials with the feet resting on narrow transverse beams, the subjects switched
from a postural response involving predominantly ankle motions (ankle
strategy) to one involving predominantly hip motions (hip strategy). This
strategy change was implemented progressively over the course of approximately
5-20 trials, and this multitrial adaptation process was also seen for the
reverse change from beam to footplate postural strategies.

Nashner and his colleagues have interpreted these data as being
consistent with a modular synergy "conceptual model for the organization of
postural adjustments" (e.g., Nashner, 1979, 1981; Nashner & Woolacott, 1979).
Although admittedly in preliminary form, this hierarchical model proposes that
postural synergies are organized spinally as separate modular function
generators, and are automatically triggered by correspondingly appropriate
distinctive features of somatosensory (i.e., proprioceptive information
related to joint angular rotations) inputs. Thus, for example, the AP sway
synergy module is activated in proportion to ankle rotational input, while the
vertical suspensory synergy module is activated in proportion to knee
rotational input, and inhibition of the sway module by the suspensory module
is provided to prevent simultaneous activation of both synergies. Such a
system provides a reasonable account of the automatic first trial postural
responses described above. Additionally, supraspinal processes are assumed to
modulate the input-output relationships of the peripheral synergy modules in
order to maintain postural stability using posturally relevant knowledge of
results (e.g., sensory conflict between somatosensory and vestibular sources
of information concerning the body's orientation relative to the support base
and the line of gravity). Such supraspinally controlled modulation effects are
presumed to occur relatively slowly, and are posited to underlie the
multitrial postural adaptation phenomena described above.

Task dynamics offers an attractive alternative to this hierarchical
modular synergy approach. In the modular approach, synergies are canonically
represented as stored output patterns, and are triggered by corresponding
distinctive features of somatosensory inputs. When the problem of postural
control is formulated in task dynamic terms, however, synergies need not be
canonically represented anywhere; rather, synergistic patterns of muscle
activity may be viewed as emergent properties of the task dynamically
organized postural system. In this latter view, one may define a postural task
space (see Figure 15A) in the following way, using postural control only in
the sagittal plane for purposes of illustrative simplicity. This task space
is modeled as a two-dimensional point attractor for which: a) the terminal
device is the body's center of mass and is represented as a point mass with
mass $m_T$ equal to total body mass (note that, unlike earlier examples, this
terminal device cannot in general be associated with a particular point on the
linkage); b) axis $t_2$ is defined parallel to the line of gravity and axis
$t_1$ is defined normal to $t_2$; and c) the $t_1 t_2$ origin is defined by the
target location of the mass center, which coincides with the mass center's
initial location (assuming a corresponding posturally stable initial body
configuration). The task space equations of motion are:

$$m_T \ddot{t}_1 + b_{T1} \dot{t}_1 + k_{T1} t_1 = 0 \tag{26a}$$

$$m_T \ddot{t}_2 + b_{T2} \dot{t}_2 + k_{T2} t_2 = 0, \text{ where} \tag{26b}$$

the damping and stiffness parameters define point attractor topologies along
each task axis. Gravity does not appear explicitly in (26b) since $t_2$
denotes displacement from the statically stable vertical position of the task
mass in the gravitational field. In other words, $t_2 = t_2^* - (m_T g / k_{T2})$,
where $t_2^*$ corresponds to the statically stable "vertical" position of the
task mass in the absence of gravity, and $g$ denotes the acceleration due to
gravity. In matrix form these equations become:

$$M_T \ddot{t} + B_T \dot{t} + K_T t = 0 \tag{27}$$
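
The remark that gravity drops out of equation 26b can be checked with a small numerical sketch. It assumes a sign convention the text does not state explicitly: $t_2^*$ is measured along the line of gravity from the gravity-free rest position, so that the loaded system obeys $m_T \ddot{t}_2^* + b_{T2} \dot{t}_2^* + k_{T2} t_2^* = m_T g$. The mass, damping, and stiffness values and the function name are hypothetical.

```python
def settle_with_gravity(m_T, b_T2, k_T2, g=9.81, dt=0.001, duration=20.0):
    """Integrate m*a + b*v + k*t2_star = m*g from rest at t2_star = 0.

    t2_star is measured along the line of gravity from the gravity-free rest
    position (an assumed sign convention). Returns the final t2_star.
    """
    t2_star, v = 0.0, 0.0
    for _ in range(int(round(duration / dt))):
        a = (m_T * g - b_T2 * v - k_T2 * t2_star) / m_T
        v += dt * a              # semi-implicit Euler: update velocity first,
        t2_star += dt * v        # then position with the new velocity
    return t2_star

m_T, b_T2, k_T2 = 70.0, 1400.0, 7000.0   # hypothetical body mass, damping, stiffness
final_t2_star = settle_with_gravity(m_T, b_T2, k_T2)
final_t2 = final_t2_star - m_T * 9.81 / k_T2   # task coordinate of equation 26b
```

The loaded mass settles at $t_2^* = m_T g / k_{T2}$, so the shifted coordinate $t_2$ settles at zero, which is why the homogeneous form of equation 26b suffices.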



The pattern of task spatial dynamic parameters in (27) may be transformed
into body spatial form with reference to a coordinate system whose origin
coincides with the center of the support base. The spatial relationships
between task space and body space are illustrated in Figure 15B in which: a)
the $x_1$ axis is defined along the anteroposterior line between the rear and
front edges (denoted by open squares) of the support base, which is defined by
the contact areas between the feet and ground surface; b) the $x_2$ axis is
defined normal to $x_1$ at the midpoint of the support base; c) the relative
orientation between task and body space is defined by the angle $\Phi$; and d)
the location of the task space origin in body space coordinates is defined by
$x_0$. It should be noted that both $\Phi$ and $x_0$ are defined by the current
postural configuration, which is assumed to be statically stable, i.e., the
projection of the initial location of the center of mass (task space origin)
along the line of gravity will fall within the boundaries of the support base.
In this regard, the task dynamic approach to vertical posture control is
similar to the model proposed by Litvintsev (1972), who stated that it is
likely that "the essential role in equilibrium maintenance is played by a
mechanism which organizes muscular control at the various joints by parameters
characterizing the general body position..." (p. 590), and that "the magnitude
and the rate of deviation of the weight center projection on the support plane
are input parameters for this mechanism" (p. 598). Finally, it should be noted
that: a) in most daily activities we stand on horizontal surfaces and $\Phi$
thereby usually assumes a value of zero (Figure 15C); and b) the body spatial
equations of motion derived from (27) have the same form as equation (6) in
our earlier discrete reaching example.

Figure 15. Postural maintenance task: A. Task space; B. Body space. Open
boxes represent front and back edges of support base (feet).
Orientation angle, $\Phi$, between task and body space is nonzero;
C. Body space. $\Phi$ is zero, representing parallel orientation of
$t_1$ and $x_1$; D. Postural effector system.

The body spatial pattern of dynamic parameters may be transformed into an
equivalent task network expression based on the joint variables of a
(simplified) four-segment (foot, shank, thigh, torso), three-joint (ankle,
knee, hip) effector system (Figure 15D). This task network equation has the
same general form as the discrete reaching equation (8), except that the
postural task network involves three joints (not two joints) and two spatial
variables, thereby defining a redundant task-articulator situation (see
footnote 7) and hence requiring the use of the Jacobian pseudoinverse, $J^+$,
or weighted pseudoinverse, $J^*$.

In the task dynamic framework, it is evident that consistent synergistic
patterns of postural responses will occur in response to given types of
destabilizing inputs. If a task network is established according to an
accurate evaluation of the spatial relationships between task and body space,
these postural responses will be stabilizing and compensatory. Further, they
will be "immediately" accurate since they depend only upon the current limb
state and the (accurately tuned) task network. In other words, synergistic
responses emerge from the (tuned) postural system's underlying task dynamic
organization; there is no need to invoke the notion of access to and
triggering of stored canonical synergy output programs. However, the postural
system can be fooled into establishing an improperly tuned task network based
either on an inappropriate evaluation of the task-body space geometric
relationship, or on the use of an inappropriate weighting strategy for the
joints in the (redundant) postural effector system. In the former case, for
example, a series of trials involving AP translation perturbations requires
tuning $\Phi = 0$ since the support base is horizontal throughout the trials. If
a direct rotation perturbation is unexpectedly introduced, this setting is no
longer valid and the task network will shape postural responses that are
inappropriate and destabilizing for the new task-body space geometry. Adaptive
responses to direct rotation perturbations require setting $\Phi = \theta_a$ (where
$\theta_a$ = ankle angle) in order to tune the task network appropriately.
Apparently, this sort of retuning process does not occur instantaneously, but
requires 3-5 trials as discussed earlier.



In the case of tunings related to effector system weighting strategies,
it appears that an efficient strategy for dealing with AP translation
perturbations of the foot plates is an ankle-predominant one when the feet
rest directly on the plates, but a hip-predominant one when the feet rest on
narrow beams. These strategies would serve to tune differentially the
weighting matrices for the task network (via the weighted pseudoinverse,
$J^*$) according to the current support surface configuration. If the support
surface context is changed, say, from plate to beam support, then the
ankle-weighted $J^*$ used for the plate context will be inappropriate for (or
less efficient than) the new beam context. Apparently, adaptively retuning
$J^*$ to reflect a hip-predominant strategy (and vice versa for hip to ankle
strategy retuning) requires approximately 5-20 trials as discussed earlier.
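
How a weighting matrix can shift a redundant effector system between ankle-predominant and hip-predominant solutions is sketched below. The formula $J^* = W^{-1} J^T (J W^{-1} J^T)^{-1}$ is the standard weighted pseudoinverse and is an assumption here, since the form used in footnote 7 is not reproduced in this section. Further assumptions: $W$ is diagonal and penalizes motion at each joint (a small weight makes a joint "cheap"), the top of a three-segment planar chain stands in for the center of mass, and segment lengths, posture, and weights are hypothetical.

```python
import math

SEGMENTS = (0.40, 0.45, 0.50)  # hypothetical shank, thigh, torso lengths (m)

def jacobian(q):
    """2 x 3 Jacobian of the chain's top end for relative joint angles q (rad)."""
    absolute = [sum(q[:i + 1]) for i in range(3)]
    row_x = [-sum(SEGMENTS[k] * math.sin(absolute[k]) for k in range(i, 3)) for i in range(3)]
    row_y = [sum(SEGMENTS[k] * math.cos(absolute[k]) for k in range(i, 3)) for i in range(3)]
    return row_x, row_y

def weighted_pinv_solve(J, weights, v):
    """Return q_dot = W^-1 J^T (J W^-1 J^T)^-1 v for diagonal W = diag(weights)."""
    row_x, row_y = J
    inv_w = [1.0 / w for w in weights]
    a = sum(iw * x * x for iw, x in zip(inv_w, row_x))
    b = sum(iw * x * y for iw, x, y in zip(inv_w, row_x, row_y))
    d = sum(iw * y * y for iw, y in zip(inv_w, row_y))
    det = a * d - b * b
    lam_x = (d * v[0] - b * v[1]) / det
    lam_y = (a * v[1] - b * v[0]) / det
    return [iw * (x * lam_x + y * lam_y) for iw, x, y in zip(inv_w, row_x, row_y)]

posture = [math.radians(a) for a in (80.0, 20.0, -10.0)]  # ankle, knee, hip
J = jacobian(posture)
correction = (0.1, 0.0)  # desired anteroposterior velocity of the task point (m/s)

# Small weight = low cost of motion at that joint.
ankle_strategy = weighted_pinv_solve(J, weights=(1.0, 100.0, 100.0), v=correction)
hip_strategy = weighted_pinv_solve(J, weights=(100.0, 100.0, 1.0), v=correction)
```

Both solutions produce the same task-space correction, but the first concentrates the joint motion at the ankle and the second at the hip. Retuning the weights, rather than retrieving a different stored synergy, is what changes the pattern.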

B. Rest angle trajectories; Network coupling

1. Rest angles: final position control; trajectory formation. It was
noted above (in the Topology and Dynamics section) that discrete target
acquisition tasks in one degree of freedom systems (e.g., at the elbow joint)
were observed to display properties homologous to damped mass-spring systems
by several investigators (e.g., Cooke, 1980; Fel'dman, 1966; Kelso, 1977;
Polit & Bizzi, 1978; Schmidt & McGown, 1980) and had been modeled,
essentially, as point attractors in an articulator dynamic sense, requiring
only the setting of the final or target rest angle parameter (but see
footnote 4). According to the so-called "final position control" hypothesis
(e.g., Bizzi, Accornero, Chapple, & Hogan, 1981; Kelso & Holt, 1980; Sakitt,
1980), the relative levels of neural activation of the spring-like agonist
and antagonist muscle groups at a joint

define an equilibrium point between two opposing length-tension
curves and consequently a joint angle. It has been suggested that
the transition from a given position to another may occur whenever
the CNS (central nervous system) generates a signal shifting the
equilibrium point between the two muscles by selecting a new pair of
length-tension curves (Bizzi et al., 1981).

According to this schema, movements are, at the simplest level,
transitions in posture. This simple idea is attractive because the
details of the movement trajectory will be determined by the
inertial and visco-elastic properties of muscles and ligaments around
the joint (ibid).

However, as we discussed above (Section III), such an articulator-dynamic
control scheme breaks down when more complex multijoint tasks are considered
(see also footnote 4). Further, even for single degree of freedom positioning
tasks, the final position control hypothesis may be incomplete. Bizzi and his
colleagues (Bizzi & Abend, 1982; Bizzi et al., 1981; Bizzi, Accornero,
Chapple, & Hogan, 1982; Bizzi, Chapple, & Hogan, 1982), for example, have
suggested that the rest angle trajectory is controlled in addition to final
position. Thus, the final position control hypothesis predicts that elbow
movements result from rapid shifts to target equilibrium points and that,
consequently, steady state equilibrium positions would be achieved after a
delay from muscle activity onset due solely to the dynamics of muscle
activation. Bizzi, Chapple, and Hogan (1982) offer a "slowest case"
approximation of 150 ms for the time taken by the net muscle force to rise
within a few percent of its final value. In fact, however, these investigators
showed that for movements of at least 600 ms in duration, the mechanical
expression of alpha motoneuronal activity reached steady state only after at
least 400 ms had passed following the onset of muscle activity. Consequently,
it appears that the centrally generated rest angle signal gradually changes
during the movement, even in deafferented monkeys, such that the alpha
motoneuronal activity "defines a series of equilibrium positions, which
constitute a trajectory whose end point is the desired final position" (ibid).
Finally, it should be noted that Bizzi et al. (1981) interpret their
observations as implying the existence of trajectory plans or programs to
account for the observed time courses of rest angle movement as well as the
final rest angle position.
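
The contrast between a step-like and a gradual shift of the rest angle can be sketched with a single damped mass-spring joint. This is only an illustration of the two hypotheses, not a model fitted to the data of Bizzi and colleagues: the unit inertia, critical damping, natural frequency, target angle, and function names are hypothetical, and the 0.4 s ramp merely echoes the 400 ms figure quoted above.

```python
def simulate_joint(rest_angle, omega=20.0, dt=0.001, duration=1.5):
    """Critically damped unit-inertia joint tracking a time-varying rest angle.

    Integrates theta'' = -2*omega*theta' - omega**2 * (theta - rest_angle(t))
    with semi-implicit Euler and returns the list of theta samples.
    """
    theta, velocity, samples = 0.0, 0.0, [0.0]
    for step in range(int(round(duration / dt))):
        t = step * dt
        accel = -2.0 * omega * velocity - omega ** 2 * (theta - rest_angle(t))
        velocity += dt * accel
        theta += dt * velocity
        samples.append(theta)
    return samples

TARGET = 1.0  # hypothetical final rest angle (rad)

def step_shift(t):
    """Final position control: the rest angle jumps to the target at onset."""
    return TARGET

def gradual_shift(t, rise_time=0.4):
    """Trajectory formation: the rest angle ramps to the target over rise_time."""
    return TARGET * min(t / rise_time, 1.0)

step_response = simulate_joint(step_shift)
gradual_response = simulate_joint(gradual_shift)
```

Both runs end at the same final angle, but 200 ms after onset the joint driven by the step is already near the target while the joint driven by the ramp lags well behind. Under these assumptions, the time course of the movement therefore carries information about how the rest angle itself evolved.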

The control law version of task dynamics is unable to account for these
data for two reasons. First, there is no parameter corresponding to rest
angle in the single degree of freedom case or rest configuration in the
multi-degree of freedom case. Second, the control law version assumes that
$\theta$ and $\dot{\theta}$ (real arm state) are perceived proprioceptively, that $\phi$ and
$\dot{\phi}$ (model arm state) equal the real arm's state, and that control laws are
specified according to the currently perceived real arm's state. In the
"deafferented" case, in which the current $\theta$ and $\dot{\theta}$ are unavailable, the
control laws are undefined and (coordinated) motion is not possible. Given the
above "trajectory formation" data of Bizzi and colleagues, if task dynamics is
to be applied in these situations, the control law version must be amended to
generate coordinated movements in deafferented preparations and to include a
rest configuration parameter (which, of course, must evolve autonomously
during the movement according to task-dynamic constraints). Although in
preliminary form, we believe a network coupling version of task dynamics
satisfies these requirements and provides a more biologically plausible task
dynamic account of skilled movements.



2. Network coupling. The network coupling method (outlined in Figure
16) involves shaping articulator dynamics according to task-specific dynamical
constraints and may closely approximate a biological style of coordination and
regulation. Briefly, the network coupling method involves interpreting the
observed skilled motion of an effector system to be the observable "output" of
an articulator network that comprises, however, only one half of a
task-specific action system. The complete action system consists of the
mutually or bidirectionally coupled task (output variables: $\phi$, $\dot{\phi}$,
etc.) and articulator (output variables: $\theta$, $\dot{\theta}$, etc.) networks. Thus,
for the multidegree of freedom discrete reaching task described earlier, this
method involves: a) treating the task network defined in equation 8 as a
system for intrinsic pattern generation that is specified for a given task and
actor-environment context, and that does not require peripheral input for its
operation; b) defining the articulator network corresponding to an actual arm
by the following version of equation (11):

!^ * * ifM * if,h(s * li'A, - a (28) 
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Figure 16. Overview of information flow in network coupling version of task 
dynamics. 
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and c) using the task network to both actuate and modulate the articulator 
network, while using the articulator network to modulate the task network. 

More specifically, the network coupling method begins by using the output
of the task network as the $\theta_0$ input ("rest configuration," i.e.,
$\theta_0 = \phi$) for the articulator network. However, since the task and
articulator networks are potentially independent, one cannot simply assume
identical task and arm network states (as in the control law approach).
Rather, we make the less stringent assumption that real arm and model arm
states are "close," i.e., that $\theta - \phi = \Delta\theta$ and
$\dot{\theta} - \dot{\phi} = \Delta\dot{\theta}$ are "small." Therefore, the
constraint relationships for $B_A$, $K_A$, and $S_A$ are defined in a more
approximate sense than those in equation (13) (see Appendix C for details):

•^A = t"A''"^"BV3lg^=jg (29b) 
X^,= [H,rW^3|g^!^«p (29c) 

These sets of "driving" constraints ($\theta_0 = \phi$) and "modulating"
constraints (equations 29) comprise the "efferent" aspect of our
coupled-network action system. With these constraints, the articulator network
becomes (statically) stable about the current rest configuration, with
stiffness and damping properties defined relative to task space axis
directions.

However, a coupled-network action system involves bi-directional coupling
and hence an "afferent" aspect as well. This pattern of afferentation serves
to modulate the activity of the task network on the basis of both relative
angular displacement ($\Delta\phi = \phi - \theta$) and relative angular
velocity ($\Delta\dot{\phi} = \dot{\phi} - \dot{\theta}$) coupling terms
defined by $\alpha\Delta\phi$ and $\beta\Delta\dot{\phi}$, respectively, where
$\alpha$ and $\beta$ are constant scalar coupling coefficients. This type of
coupling, which is proportional to differences between corresponding sets of
state variables, is called diffusive coupling (e.g., Rand & Holmes, 1980). The
modulated task network is then described by the following amended version of
equation (8):



where, by assumption, $\Delta\phi$ and $\Delta\dot{\phi}$ are "small." The
effects of these coupling terms on system behavior are to reduce the size of
$\Delta\phi$ ($= -\Delta\theta$) via $\alpha$ coupling and to reduce the size
of $\Delta\dot{\phi}$ via $\beta$ coupling, thereby promoting an in-phase (vs.
anti-phase) one-to-one relationship between real and model arm motions. It
should be noted that equation (30) reverts to equation (8) when $\Delta\phi$
and $\Delta\dot{\phi}$ equal zero (i.e., there is perfect mutual tracking of
the real and model arms) or when the afferent coupling is disengaged (i.e.,
peripheral feedback is eliminated and the system is "deafferented") by setting
$\alpha$ and $\beta$ to zero. Further, one should note that, even when
deafferented, the model arm is governed by the task network equation (30) and
hence $\theta$ is capable of coordinated (although probably degraded) motion
due to "internal feedback" of the model arm's current state within the task
network. Here, internal feedback is used in the sense of Evarts (1971) to
indicate information "arising from structures within the nervous system" as
opposed to peripheral information from proprioceptive sources in the (real)
limbs. Finally, although the operation of the coupled action system involves
regulating $\Delta\phi$ ($= -\Delta\theta$) and $\Delta\dot{\phi}$
($= -\Delta\dot{\theta}$) to be "small," this is not solely a function of
coordination control requirements per se but also serves to validate the
"small" relative displacement and velocity assumptions used for the real arm
control matrices specified in relationship (29).
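
The bidirectional coupling described above can be illustrated with a minimal
single degree of freedom sketch. It is not the formulation of equations (28)
to (30): it assumes linear mass-spring dynamics for both networks, and the
function name `simulate`, the stiffness and damping values, and the coupling
values are all hypothetical choices made for illustration. The task network
(model arm angle `phi`) drives the articulator network (real arm angle
`theta`) by supplying its rest angle, and the articulator network feeds back
through diffusive coupling terms.

```python
def simulate(alpha, beta, target=1.0, dt=0.001, t_end=10.0):
    """Integrate a 1-DOF task network coupled to an articulator network.

    phi   : model arm angle (task network output)
    theta : real arm angle (articulator network output)
    alpha, beta : afferent displacement and velocity coupling coefficients
    All numerical values are illustrative assumptions, not fitted parameters.
    """
    k_t, b_t = 25.0, 10.0    # task network stiffness and damping (assumed)
    k_a, b_a = 100.0, 20.0   # articulator stiffness and damping (assumed)
    phi = dphi = theta = dtheta = 0.0
    for _ in range(int(round(t_end / dt))):
        # Afferent aspect: diffusive coupling proportional to the differences
        # between corresponding model arm and real arm state variables.
        ddphi = (-b_t * dphi - k_t * (phi - target)
                 - alpha * (phi - theta) - beta * (dphi - dtheta))
        # Efferent aspect: the model arm angle is the real arm's rest angle.
        ddtheta = -b_a * dtheta - k_a * (theta - phi)
        # Semi-implicit Euler step: update velocities, then positions.
        dphi += ddphi * dt
        dtheta += ddtheta * dt
        phi += dphi * dt
        theta += dtheta * dt
    return phi, theta


coupled = simulate(alpha=5.0, beta=1.0)
deafferented = simulate(alpha=0.0, beta=0.0)
```

In this sketch, setting `alpha` and `beta` to zero removes the afferent terms,
so the model arm follows the uncoupled task dynamics and the real arm still
reaches the target because its rest angle continues to evolve.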

In summary, the network coupling version of task dynamics may provide a
more biologically relevant sensorimotor control scheme than does the control
law version. For single degree of freedom positioning tasks, it provides a
rational account of the centrally specified rest angle's trajectory for these
tasks without needing to invoke an explicitly preplanned representation of
that trajectory. Rather, the rest angle trajectory evolves, even in the
deafferented case, as an ongoing function of the underlying task dynamics.
Similarly, when applied to discrete planar reaching tasks of 2-joint arms, the
rest configuration trajectory will evolve so that the hand should move in a
quasi-straight line from initial to final position. Finally, when applied to
cyclic spatial movements of a multijoint arm, the network coupling approach
shares certain features with recent work on locomotion (cf. Grillner, 1981,
for review). Investigators in this field assume the existence of innate,
endogenous, cellular networks that are: a) capable of driving the limbs
according to the locomotor task without requiring peripheral information; yet
b) can be modulated, in phase dependent ways, by this same peripheral input
(e.g., Forssberg, Grillner, & Rossignol, 1975). Task networks may be
interpreted as the abstract, learned analogs of such concretely defined,
innate networks. Thus, from a task dynamic perspective, the origins of task
networks lie in the active discovery and specification processes that occur
during skill learning. Once acquired, their operation is tailored to (tuned
by) currently perceived task demands and the actor-environment spatial
context.

References 

Abbs, J. H., & Gracco, V. L. (in press). Control of complex motor gestures:
    Orofacial muscle responses to load perturbations of the lip during
    speech. Journal of Neurophysiology.

Abraham, R. H., & Shaw, C. D. (1982). Dynamics - The geometry of behavior.
    Santa Cruz, CA: Aerial Press.

Benati, M., Gaglio, S., Morasso, P., Tagliasco, V., & Zaccaria, R. (1980).
    Anthropomorphic robotics. I. Representing mechanical complexity.
    Biological Cybernetics, 38, 125-140.

Bizzi, E. (1980). Central and peripheral mechanisms in motor control. In
    G. E. Stelmach & J. Requin (Eds.), Tutorials in motor behavior.
    Amsterdam: North-Holland.

Bizzi, E., & Abend, W. (1982). Posture control and trajectory formation in
    single and multiple joint arm movements. In J. E. Desmedt (Ed.), Brain
    and spinal mechanisms of movement control in man: New developments and
    clinical applications. New York: Raven Press.

Bizzi, E., Accornero, N., Chapple, W., & Hogan, N. (1981). Processes
    underlying arm trajectory formation. In C. Ajmone-Marsan & O. Pompeiano
    (Eds.), Brain mechanisms of perceptual awareness and purposeful behavior
    (IBRO Monograph Series, pp. 311-318). New York: Raven Press.

Bizzi, E., Accornero, N., Chapple, W., & Hogan, N. (1982). Arm trajectory
    formation in monkeys. Experimental Brain Research, 46, 139-143.

Bizzi, E., Chapple, W., & Hogan, N. (1982). Mechanical properties of
    muscles: Implications for motor control. Trends in Neurosciences, 5,
    395-398.

Cooke, J. D. (1980). The organization of simple, skilled movements. In
    G. E. Stelmach & J. Requin (Eds.), Tutorials in motor behavior.
    Amsterdam: North-Holland.

Delatizky, J. (1982). Final position control in simulated planar horizontal
    arm movements. Unpublished doctoral dissertation, Massachusetts
    Institute of Technology, Department of Electrical Engineering and
    Computer Science.

Dorf, R. C. (1974). Modern control systems (2nd ed.). Reading, MA:
    Addison-Wesley.

Evarts, E. V. (1971). Feedback and corollary discharge: A merging of the
    concepts. In Central control of movement. Neurosciences Research Program
    Bulletin, 9, 86-112.

Fel'dman, A. G. (1966). Functional tuning of the nervous system with control
    of movement or maintenance of a steady posture. III. Mechanographic
    analysis of execution by man of the simplest motor tasks. Biophysics,
    11, 766-775.

Fel'dman, A. G., & Latash, M. L. (1982). Interaction of afferent and
    efferent signals underlying joint position sense: Empirical and
    theoretical approaches. Journal of Motor Behavior, 14, 174-193.

Folkins, J. W., & Abbs, J. H. (1975). Lip and jaw motor control during
    speech: Responses to resistive loading of the jaw. Journal of Speech
    and Hearing Research, 18, 207-220.

Forssberg, H., Grillner, S., & Rossignol, S. (1975). Phase dependent reflex
    reversal during walking in chronic spinal cats. Brain Research, 55,
    247-304.

Fowler, C. (1977). Timing control in speech production. Bloomington, IN:
    Indiana University Linguistics Club.

Fowler, C. A., Rubin, P., Remez, R. E., & Turvey, M. T. (1980). Implications
    for speech production of a general theory of action. In B. Butterworth
    (Ed.), Language production. New York: Academic Press.

Georgopoulos, A. P., Kalaska, J. F., & Massey, J. T. (1981). Spatial
    trajectories and reaction times of aimed movements: Effects of practice,
    uncertainty, and change in target location. Journal of Neurophysiology,
    46, 725-743.

Greene, P. H. (1971). Introduction. In I. M. Gelfand, V. S. Gurfinkel,
    S. V. Fomin, & M. L. Tsetlin (Eds.), Models of the structural-functional
    organization of certain biological systems. Cambridge, MA: MIT Press.

Grillner, S. (1981). Control of locomotion in bipeds, tetrapods, and fish.
    In J. M. Brookhart & V. B. Mountcastle (Eds.), Handbook of physiology,
    section 1: The nervous system; vol. II: Motor control, part 2
    (pp. 1179-1236). Bethesda, MD: American Physiological Society.

Grillner, S. (1982). Possible analogies in the control of innate motor acts
    and the production of sound in speech. In S. Grillner, B. Lindblom,
    J. Lubker, & A. Persson (Eds.), Speech motor control. Oxford: Pergamon
    Press.

Hebb, D. O. (1949). The organization of behavior. New York: Wiley.

Hogan, N. (1980). Mechanical impedance control in assistive devices and
    manipulators. In Proceedings of the Joint Automatic Control Conferences,
    San Francisco, Vol. 1, TA-10B.

Hogan, N., & Cotter, S. L. (1982). Cartesian impedance control of a
    nonlinear manipulator. In W. J. Book (Ed.), Robotics research and
    advanced applications (pp. 121-128). New York: ASME.

Hollerbach, J. M. (1982). Computers, brains, and the control of movement.
    Trends in Neurosciences, 5, 189-192.

Hollerbach, J. M., & Flash, T. (1981). Dynamic interactions between limb
    segments during planar arm movement (AIM-635). Boston: Massachusetts
    Institute of Technology, Artificial Intelligence Laboratory.

Horak, F., & Nashner, L. (1983). Two distinct strategies for stance posture
    control: Adaptation to altered support surface configuration. Society
    of Neuroscience Abstracts, 9.

Ito, M. (1982). Questions in modeling the cerebellum. Journal of
    Theoretical Biology, 99, 81-86.

Jordan, D. W., & Smith, P. (1977). Nonlinear ordinary differential
    equations. Oxford: Clarendon Press.

Kelso, J. A. S. (1977). Motor control mechanisms underlying human movement
    reproduction. Journal of Experimental Psychology: Human Perception and
    Performance, 3, 529-543.

Kelso, J. A. S. (1981). Contrasting perspectives on order and regulation in
    movement. In J. Long & A. Baddeley (Eds.), Attention and performance
    (IX). Hillsdale, NJ: Erlbaum.

Kelso, J. A. S., & Holt, K. G. (1980). Exploring a vibratory systems
    analysis of human movement production. Journal of Neurophysiology, 43,
    1183-1196.

Kelso, J. A. S., Holt, K. G., Kugler, P. N., & Turvey, M. T. (1980). On the
    concept of coordinative structures as dissipative structures: II.
    Empirical lines of convergence. In G. E. Stelmach & J. Requin (Eds.),
    Tutorials in motor behavior (pp. 49-70). New York: North-Holland.

Kelso, J. A. S., Holt, K. G., Rubin, P., & Kugler, P. N. (1981). Patterns of
    human interlimb coordination emerge from the properties of nonlinear
    limit cycle oscillatory processes: Theory and data. Journal of Motor
    Behavior, 13, 226-261.

Kelso, J. A. S., & Saltzman, E. L. (1982). Motor control: Which themes do
    we orchestrate? The Behavioral and Brain Sciences, 5, 554-557.

Kelso, J. A. S., Southard, D. L., & Goodman, D. (1979). On the coordination
    of two-handed movements. Journal of Experimental Psychology: Human
    Perception and Performance, 5, 229-238.

Kelso, J. A. S., Tuller, B., & Fowler, C. A. (1982). The functional
    specificity of articulatory control and coordination. Journal of the
    Acoustical Society of America, 72, S103.

Kelso, J. A. S., Tuller, B. H., & Harris, K. S. (1983). A 'dynamic pattern'
    perspective on the control and coordination of movement. In
    P. MacNeilage (Ed.), The production of speech. New York:
    Springer-Verlag.

Klein, C. A., & Huang, C.-H. (1983). Review of pseudoinverse control for use
    with kinematically redundant manipulators. IEEE Transactions on Systems,
    Man, and Cybernetics, SMC-13, 245-250.

Kugler, P. N., Kelso, J. A. S., & Turvey, M. T. (1980). On the concept of
    coordinative structures as dissipative structures: I. Theoretical lines
    of convergence. In G. E. Stelmach & J. Requin (Eds.), Tutorials in motor
    behavior (pp. 3-47). New York: North-Holland.

Lashley, K. S. (1930). Basic neural mechanisms in behavior. Psychological
    Review, 37, 1-24.

Litvintsev, A. I. (1972). Vertical posture control mechanisms in man.
    Automation and Remote Control, 33, 590-600.

Mason, M. T. (1981). Compliance and force control for computer controlled
    manipulators. IEEE Transactions on Systems, Man, and Cybernetics,
    SMC-11, 418-432.



McGhee, R. B., & Iswandhi, G. I. (1979). Adaptive locomotion of a
    multilegged robot over rough terrain. IEEE Transactions on Systems, Man,
    and Cybernetics, SMC-9, 176-182.

Minorsky, N. (1962). Nonlinear oscillations. Princeton, NJ: Van Nostrand.

Morasso, P. (1981). Spatial control of arm movements. Experimental Brain
    Research, 42, 223-227.

Nashner, L. M. (1979). Organization and programming of motor activity during
    posture control. In R. Granit & O. Pompeiano (Eds.), Reflex control of
    posture and movement (Progress in Brain Research, Vol. 50,
    pp. 177-184). New York: Elsevier/North-Holland Biomedical Press.

Nashner, L. M. (1981). Analysis of stance posture in humans. In A. L. Towe
    & E. S. Luschei (Eds.), Handbook of behavioral neurobiology: Vol. 5:
    Motor coordination (pp. 527-565). New York: Plenum.

Nashner, L. M., & Woollacott, M. (1979). The organization of rapid postural
    adjustments of standing humans: An experimental-conceptual model. In
    R. E. Talbott & D. R. Humphrey (Eds.), Posture and movement
    (pp. 243-257). New York: Raven Press.

Nashner, L. M., Woollacott, M., & Tuma, G. (1979). Organization of rapid
    responses to postural and locomotor-like perturbations of standing man.
    Experimental Brain Research, 36, 463-476.

Pellionisz, A., & Llinas, R. (1979). Brain modeling by tensor network theory
    and computer simulation. The cerebellum: Distributed processor for
    predictive coordination. Neuroscience, 4, 323-348.

Polit, A., & Bizzi, E. (1978). Processes controlling arm movements in
    monkeys. Science, 201, 1235-1237.

Raibert, M. H. (1978). A model for sensorimotor control and learning.
    Biological Cybernetics, 29, 29-36.

Raibert, M. H., Brown, H. B., Jr., Chepponis, M., Hastings, E., Shreve,
    S. E., & Wimberly, F. C. (1981). Dynamically stable legged locomotion
    (Technical Report CMU-RI-TR-81-9). Pittsburgh: Carnegie Mellon
    University, The Robotics Institute.

Raibert, M. H., & Craig, J. J. (1981). Hybrid position/force control of
    manipulators. ASME Journal of Dynamic Systems, Measurement, and Control,
    102, 126-133.

Rand, R. H., & Holmes, P. J. (1980). Bifurcation of periodic motions in two
    weakly coupled van der Pol oscillators. International Journal of
    Non-Linear Mechanics, 15, 387-399.

Sakitt, B. (1980). A spring model and equivalent neural network for arm
    posture control. Biological Cybernetics, 37, 227-234.

Saltzman, E. (1979). Levels of sensorimotor representation. Journal of
    Mathematical Psychology, 20, 91-163.

Schmidt, R. A. (1982). Motor control and learning: A behavioral emphasis.
    Champaign, IL: Human Kinetics.

Schmidt, R. A., & McGown, C. (1980). Terminal accuracy of unexpectedly
    loaded rapid movements: Evidence for a mass-spring mechanism in
    programming. Journal of Motor Behavior, 12, 149-161.

Soechting, J. F. (1982). Does position sense at the elbow joint reflect a
    sense of elbow joint angle or one of limb orientation? Brain Research,
    248, 392-395.

Soechting, J. F., & Lacquaniti, F. (1981). Invariant characteristics of a
    pointing movement in man. Journal of Neuroscience, 1, 710-720.

Stein, R. B. (1982). What muscle variables does the central nervous system
    control? The Behavioral and Brain Sciences, 5, 535-577.

Szentagothai, J., & Arbib, M. A. (Eds.). (1974). Conceptual models of neural
    organization. Neurosciences Research Program Bulletin, 12(3).

Turvey, M. T., & Shaw, R. E. (1979). The primacy of perceiving: An
    ecological reformulation of perception for understanding memory. In
    L.-G. Nilsson (Ed.), Perspectives on memory research: Essays in honor of
    Uppsala University's 500th anniversary. Hillsdale, NJ: Erlbaum.

Turvey, M. T., Shaw, R. E., & Mace, W. (1978). Issues in the theory of
    action: Degrees of freedom, coordinative structures and coalitions. In
    J. Requin (Ed.), Attention and performance VII. Hillsdale, NJ: Erlbaum.

Viviani, P., & Terzuolo, C. (1980). Space-time invariance in learned motor
    skills. In G. E. Stelmach & J. Requin (Eds.), Tutorials in motor
    behavior. Amsterdam: North-Holland.

Whitney, D. E. (1972). The mathematics of coordinated control of prosthetic
    arms and manipulators. ASME Journal of Dynamic Systems, Measurement, and
    Control, 94, 303-309.



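Appendix A (Equation 7)

The body spatial variables ($x$, $\dot{x}$, $\ddot{x}$) of equation (6) are
transformed into the joint variables ($\phi$, $\dot{\phi}$, $\ddot{\phi}$) of
a massless arm model using the following kinematic relationships:

$$x = x(\phi) \tag{A1}$$

$$\dot{x} = J(\phi)\dot{\phi} \tag{A2}$$

$$\ddot{x} = J(\phi)\ddot{\phi} + (dJ(\phi)/dt)\dot{\phi}
          = J(\phi)\ddot{\phi} + V(\phi)\phi_p, \tag{A3}$$

where

$x(\phi)$ = the current body spatial position vector of the terminal device
expressed as a function of the current model arm configuration;

$$x(\phi) = \bigl(l_1\sin\phi_1 + l_2\sin(\phi_1+\phi_2),\;
            -(l_1\cos\phi_1 + l_2\cos(\phi_1+\phi_2))\bigr)^T;$$

$\phi_p = (\dot{\phi}_1^2,\; \dot{\phi}_1\dot{\phi}_2,\; \dot{\phi}_2^2)^T$ =
the current joint velocity product vector;

$J(\phi)$ = the Jacobian transform matrix;

$$J(\phi) = \begin{pmatrix}
l_1\cos\phi_1 + l_2\cos(\phi_1+\phi_2) & l_2\cos(\phi_1+\phi_2) \\
l_1\sin\phi_1 + l_2\sin(\phi_1+\phi_2) & l_2\sin(\phi_1+\phi_2)
\end{pmatrix};$$

$V(\phi)$ = a matrix resulting from rearranging the terms of the expression
"$(dJ(\phi)/dt)\dot{\phi}$" in order to segregate the joint velocity products
into a single vector $\phi_p$;

$$V(\phi) = \begin{pmatrix}
-(l_1\sin\phi_1 + l_2\sin(\phi_1+\phi_2)) & -2l_2\sin(\phi_1+\phi_2) & -l_2\sin(\phi_1+\phi_2) \\
l_1\cos\phi_1 + l_2\cos(\phi_1+\phi_2) & 2l_2\cos(\phi_1+\phi_2) & l_2\cos(\phi_1+\phi_2)
\end{pmatrix}.$$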

Making these substitutions into (6) and rearranging we get equation (7): 



13^ O O **p 



It should be noted that since $\Delta x$ in equation (6) is not assumed
"small," the differential approximation $\Delta x = J(\phi)\Delta\phi$ is not
justified and, therefore, equation (A1) was used instead for the kinematic
displacement transformation into model arm variables.
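
The acceleration relationship for the two-segment arm can be checked
numerically. The sketch below is an illustration only: the segment lengths,
the test configuration, and all function names are assumptions. It uses the
planar convention in which the terminal device position is
(l1 sin φ1 + l2 sin(φ1+φ2), -(l1 cos φ1 + l2 cos(φ1+φ2))), builds the Jacobian
and the velocity product matrix for that convention, and compares the
resulting acceleration with a finite-difference second derivative of the
position.

```python
import math

L1, L2 = 0.30, 0.25  # segment lengths in metres (illustrative values)


def position(p1, p2):
    """Terminal device position for joint angles p1 (shoulder), p2 (elbow)."""
    return (L1 * math.sin(p1) + L2 * math.sin(p1 + p2),
            -(L1 * math.cos(p1) + L2 * math.cos(p1 + p2)))


def jacobian(p1, p2):
    c1, c12 = math.cos(p1), math.cos(p1 + p2)
    s1, s12 = math.sin(p1), math.sin(p1 + p2)
    return [[L1 * c1 + L2 * c12, L2 * c12],
            [L1 * s1 + L2 * s12, L2 * s12]]


def v_matrix(p1, p2):
    """Matrix multiplying the joint velocity products (dp1^2, dp1*dp2, dp2^2)."""
    c1, c12 = math.cos(p1), math.cos(p1 + p2)
    s1, s12 = math.sin(p1), math.sin(p1 + p2)
    return [[-(L1 * s1 + L2 * s12), -2.0 * L2 * s12, -L2 * s12],
            [L1 * c1 + L2 * c12, 2.0 * L2 * c12, L2 * c12]]


def accel_from_matrices(p, dp, ddp):
    """Spatial acceleration computed as J*ddp + V*(velocity products)."""
    J, V = jacobian(*p), v_matrix(*p)
    prod = (dp[0] ** 2, dp[0] * dp[1], dp[1] ** 2)
    return [J[i][0] * ddp[0] + J[i][1] * ddp[1]
            + sum(V[i][j] * prod[j] for j in range(3))
            for i in range(2)]


def accel_numeric(p, dp, ddp, h=1e-4):
    """Central second difference of position along a joint-space path."""
    def x_at(t):
        return position(p[0] + dp[0] * t + 0.5 * ddp[0] * t * t,
                        p[1] + dp[1] * t + 0.5 * ddp[1] * t * t)
    xm, x0, xp = x_at(-h), x_at(0.0), x_at(h)
    return [(xp[i] - 2.0 * x0[i] + xm[i]) / (h * h) for i in range(2)]


state = ((0.4, 1.1), (1.5, -0.7), (0.3, 2.0))  # angles, velocities, accelerations
analytic = accel_from_matrices(*state)
numeric = accel_numeric(*state)
```

The two results should agree to within the truncation and rounding error of
the finite-difference estimate.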
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Appendix B (Equation 1) 

One may derive using a Lagrangian analysis (see, for example, Saltzman,
1979, for details) the passive mechanical equations of motion for the
2-segment arm (frictionless, no gravity) described in the text:

$$M_A\ddot{\theta} + S_A\theta_p = 0, \tag{9}$$

where

$M_A$ = acceleration sensitivity matrix with elements $q_{ij}$, where

$$q_{11} = m_2\bigl(l_1^2 + (1/3)l_2^2 + l_1 l_2\cos\theta_2\bigr) + m_1(1/3)l_1^2,$$

$$q_{12} = m_2\bigl((1/3)l_2^2 + (1/2)l_1 l_2\cos\theta_2\bigr),$$

$$q_{21} = q_{12},$$

$$q_{22} = (1/3)m_2 l_2^2;$$

$S_A$ = a 2x3 matrix with elements $s_{ij}$, resulting from rearranging the
terms of the coriolis and centripetal torque terms in order to segregate the
joint velocity products into a single vector $\theta_p$, where

$$s_{11} = 0, \quad s_{12} = -m_2 l_1 l_2\sin\theta_2, \quad s_{13} = (1/2)s_{12},$$

$$s_{21} = -s_{13}, \quad s_{22} = 0, \quad s_{23} = 0.$$
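
As an illustration of how these matrices are used, the following sketch
evaluates them for a two-segment arm of uniform rods and solves the passive
equation of motion for the joint accelerations. The masses, lengths, and
function names are hypothetical, and the joint velocity product vector is
taken to be (dθ1², dθ1·dθ2, dθ2²), by analogy with the model arm.

```python
import math


def arm_matrices(theta2, m1, m2, l1, l2):
    """Return the 2x2 inertia matrix and 2x3 velocity product matrix
    for a frictionless, gravity-free two-segment arm of uniform rods."""
    c2, s2 = math.cos(theta2), math.sin(theta2)
    q11 = m2 * (l1 ** 2 + l2 ** 2 / 3.0 + l1 * l2 * c2) + m1 * l1 ** 2 / 3.0
    q12 = m2 * (l2 ** 2 / 3.0 + 0.5 * l1 * l2 * c2)
    q22 = m2 * l2 ** 2 / 3.0
    s12 = -m2 * l1 * l2 * s2
    s13 = 0.5 * s12
    inertia = [[q11, q12], [q12, q22]]
    vel_prod = [[0.0, s12, s13], [-s13, 0.0, 0.0]]
    return inertia, vel_prod


def passive_acceleration(theta2, dtheta, m1=2.0, m2=1.5, l1=0.30, l2=0.25):
    """Solve inertia * ddtheta = -vel_prod * theta_p for the passive arm."""
    inertia, vel_prod = arm_matrices(theta2, m1, m2, l1, l2)
    theta_p = (dtheta[0] ** 2, dtheta[0] * dtheta[1], dtheta[1] ** 2)
    rhs = [-sum(vel_prod[i][j] * theta_p[j] for j in range(3))
           for i in range(2)]
    det = inertia[0][0] * inertia[1][1] - inertia[0][1] * inertia[1][0]
    # Closed-form inverse of the 2x2 inertia matrix applied to rhs.
    return [(inertia[1][1] * rhs[0] - inertia[0][1] * rhs[1]) / det,
            (inertia[0][0] * rhs[1] - inertia[1][0] * rhs[0]) / det]


bent_elbow = passive_acceleration(theta2=1.0, dtheta=(1.0, 2.0))
straight_arm = passive_acceleration(theta2=0.0, dtheta=(1.0, 2.0))
```

With the elbow fully extended (θ2 = 0) the velocity product terms vanish, so
the passive arm in this sketch has zero joint acceleration.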



Appendix C (Equation 15)

1) K^. We begin with the expression M^K^Agl^ from equation (ll ). 
Since we assume that ^ Is **small,** we are Justlflecj in making the differen- 
tial approximation: 

^"a*^a^^'o = f"X^"^^^'o • ^^^^^ 

^o 

i$ = JS^S^ " l£tj? ) denotes the differential body space displacement between 
the terminal devices of the real (articulator network) and model (task net- 
work) arms, and CM^*^4J" 3|g denotes the articulator stiffness pat- 

tern governing the real arms^s responses to small displacements about x(0 = 

The body spatial stiffness responses of the model arm specified by task 
dynamics for (possibly) large scale displacements ^(j$) * £(j^) * x from the 
reaching target are governed by the spatial restoring force term 
[J" MgKg^(^)] 1^ in equation (8). Assuming that the model - 0^) 

and real (9) arm. configurations are "close," we compare the stiffness expres- 
sions and define the following constraint relationship: 
This relationship specifies that stiffness responses of the real arm to small
$\Delta\theta$ perturbations will be defined according to task space axis
directions and task space stiffness weightings.

X*a' ^ssuiiiing that both 45 and are **sroall,*' one may use equa- 
tions a8) ana ^i^b) to define the following constraint relationships: 

\ - ^\''\\'^\^ ' ""^ ^^^^ 
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Footnotes 

1

An effector system is the set of limb segments or speech organs used
in a given action; a terminal device or end effector is the part of a
controlled effector system that is directly related to the goal of a performed
action. Thus, in a reaching task, the hand is the terminal device and the arm
is the effector system; in a cup-to-mouth task, the grasped cup is the
terminal device and the hand-arm system is the effector system; in a steady
state vowel production task, the tongue body surface is the terminal device
and the jaw-tongue system is the effector system.

2 

Different systems may have different types of escapements. For example,
van der Pol and Rayleigh oscillators have related escapement terms that are
continuous functions of the systems' states; the pendulum clock's escapement
term is a discontinuous function of the system's state, injecting a pulse of
energy at one or two discrete points in the cycle.
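
A continuous escapement of the van der Pol type can be sketched numerically.
The example below is an illustration under stated assumptions: it uses the
textbook van der Pol equation with unit frequency, and the function name,
step size, and initial conditions are arbitrary choices. The escapement term
`mu * (1 - x*x) * v` adds energy when |x| < 1 and removes it when |x| > 1, so
trajectories started inside or outside the limit cycle settle to about the
same amplitude.

```python
def van_der_pol_amplitude(x0, mu=1.0, dt=0.001, t_end=60.0):
    """Return the peak |x| over the last 10 time units of the simulation."""
    x, v = x0, 0.0
    peak = 0.0
    steps = int(round(t_end / dt))
    for i in range(steps):
        # Escapement term: a continuous function of the state (x, v).
        accel = mu * (1.0 - x * x) * v - x
        v += accel * dt   # semi-implicit Euler: velocity first,
        x += v * dt       # then position with the updated velocity
        if i * dt >= t_end - 10.0:
            peak = max(peak, abs(x))
    return peak


from_small_start = van_der_pol_amplitude(0.1)
from_large_start = van_der_pol_amplitude(4.0)
```

Both runs should end with a peak displacement close to 2, the familiar limit
cycle amplitude of this oscillator for small to moderate `mu`.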

3 

For a task in which an arm is nonredundant, the number of controlled
spatial variables for the terminal device is equal to the number of controlled
joint angular variables for the arm. Hence the inverse kinematic
transformation from spatial motions of the terminal device to corresponding
arm joint angular motions is determinate. For a task in which the number of
joint variables exceeds the number of spatial variables, this transformation
is indeterminate and the arm is redundant. For redundant arms, one may specify
the inverse kinematic transformation by: a) "freezing" the extra joints in the
arm; b) adding extra controlled spatial variables to the task description; or
c) specifying optimality criteria to be satisfied for the joint variables
during the movement.

4 

Indeed, herein lies an important difference between the various versions
of the mass-spring model (or equilibrium point hypothesis for discrete
targeting behavior). In one widespread view that is restricted to single
degree of freedom motions, muscles are represented by a pair of springs acting
across a hinge in the agonist-antagonist configuration. The final equilibrium
point is established by selecting a set of length-tension properties in
opposing muscles (e.g., Bizzi, 1980; Cooke, 1980; Kelso, 1977). This view, at
best, may work for deafferented muscle, but, as pointed out by Fel'dman and
Latash (1982, p. 178), it is inadequate for muscles in natural conditions.
Moreover, as we have taken pains to point out, it does not work for complex,
multivariable tasks. An alternative view, which we elaborate upon here, is
that the parallel between a single muscle and a spring is not a literal one.
Instead, the mass-spring model is better viewed as a model of equifinality or
motor equivalence: it is this abstract functional property that particular
behaviors share with a mass-spring system (Kelso, Holt, Kugler, & Turvey,
1980; Kelso & Saltzman, 1982). In short, the former, articulator dynamic
version is a hypothesis about a physiological mechanism whose shortcomings
have been noted (Bizzi, Accornero, Chapple, & Hogan, 1982; Fel'dman & Latash,
1982). The latter, abstract dynamic version refers to a complex system, and
is a hypothesis about behavioral function.

5 

Currently, our task-dynamic formulation does not include precision force
control tasks. It can be easily adapted for tasks that demand particular
motion patterns along a surface and only approximate control of the force
exerted by the terminal device normal to the surface (e.g., polishing a car,
erasing a blackboard). The approach can also be adapted for precision force
control tasks, however, as demonstrated by Hogan and Cotter (1982).

6

Mason (1981; see also Raibert & Craig, 1981) has formalized a related
geometrical description for manipulator contact tasks in which different tasks
are characterized by distinct generalized surfaces in a constraint space. In
this task-specific constraint space, the task degrees of freedom are
partitioned into those associated with either position or contact force
control, respectively, during performances of the associated task. Such an
approach requires, however, explicit task- and context-specific position and
force trajectory plans for the task's terminal device. In contrast, the task
dynamic approach requires no such explicit trajectory plans, due to the
task-specific dynamical topologies defined for the task-space degrees of
freedom. In our formulation, then, task-appropriate terminal device
trajectories are emergent properties implicit in the corresponding underlying
task-dynamic organizations.

7

In redundant task-articulator situations (see footnote 3), $J^{-1}$ is
not defined and the Jacobian pseudoinverse ($J^{+}$) or weighted Jacobian
pseudoinverse ($J^{*}$) may be used (Benati et al., 1980; Klein & Huang, 1983;
Whitney, 1972). Using $J^{*}$ provides an optimal weighted least squares
solution for the differential transformation from spatial to joint motion
variables. If this weighting is task dependent, then $J^{*}$ would be both
task- and configuration-dependent. For example, if a three-joint arm is used
to position the fingertip in a spatially planar reaching task, different
weightings would correspond to different arm joint motion strategies. One
weighting might correspond to a predominantly shoulder motion strategy, while
a second weighting might specify a predominantly elbow motion strategy, etc.
In such cases, elements of the weighting matrices used for the corresponding
weighted Jacobian pseudoinverses define a further set of tuning parameters for
the task network.
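
The effect of the weighting can be sketched for a three-joint planar arm. The
code below is a minimal illustration, not the authors' implementation: the
segment lengths, joint angles, weights, and function names are hypothetical.
It assumes the common weighted least squares form
J* = W^-1 J^T (J W^-1 J^T)^-1 with a diagonal weighting matrix W, which
yields the joint step that minimizes the weighted sum of squared joint
motions while producing the requested fingertip displacement.

```python
import math

LINKS = (0.30, 0.25, 0.10)  # hypothetical segment lengths in metres


def jacobian(phi):
    """2x3 Jacobian of fingertip position with respect to the joint angles,
    for position (sum l_i sin a_i, -sum l_i cos a_i), a_i cumulative angles."""
    cumulative, total = [], 0.0
    for angle in phi:
        total += angle
        cumulative.append(total)
    n = len(phi)
    row1 = [sum(LINKS[i] * math.cos(cumulative[i]) for i in range(j, n))
            for j in range(n)]
    row2 = [sum(LINKS[i] * math.sin(cumulative[i]) for i in range(j, n))
            for j in range(n)]
    return [row1, row2]


def weighted_pinv_step(J, weights, dx):
    """Joint step minimizing sum(w_i * dq_i**2) subject to J * dq = dx."""
    n = len(weights)
    w_inv = [1.0 / w for w in weights]
    # a = J * W^-1 * J^T (2x2, symmetric)
    a = [[sum(J[r][i] * w_inv[i] * J[c][i] for i in range(n))
          for c in range(2)] for r in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    # lam = a^-1 * dx via the closed-form 2x2 inverse
    lam = [(a[1][1] * dx[0] - a[0][1] * dx[1]) / det,
           (a[0][0] * dx[1] - a[1][0] * dx[0]) / det]
    # dq = W^-1 * J^T * lam
    return [w_inv[i] * (J[0][i] * lam[0] + J[1][i] * lam[1])
            for i in range(n)]


J = jacobian((0.5, 1.0, 0.5))
dx = (0.01, 0.0)
shoulder_strategy = weighted_pinv_step(J, (1.0, 100.0, 100.0), dx)
elbow_strategy = weighted_pinv_step(J, (100.0, 1.0, 100.0), dx)
```

Both joint steps produce the same fingertip displacement; penalizing elbow
and wrist motion shifts the solution toward the shoulder, and penalizing
shoulder and wrist motion shifts it toward the elbow.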
8

As we demonstrate via simulation (in the Trajectory Shaping section) in
the case of our task space point attractor reaching example, it may be
possible to ignore the velocity product torque terms, and therefore omit X
from equation (12), yet still arrive at the desired target via quasi-straight
line hand trajectories. In fact, reach trajectories generated without such
correction appear more similar to experimentally observed trajectories than
ones generated with "perfect" velocity product torque correction.

9 

The desirability of using such scaling coefficients was pointed out by
Mason (1981). In addition to using them to ensure dimensional homogeneity,
Mason showed that different values could be used to provide correspondingly
different weightings of rotational vs. linear aspects of task performances.
However, since the task dynamic approach uses relative task axis stiffness
weightings for this purpose, their value was simply set to 1.0 in our
treatments.

10 

For task spaces not defined by point attractors along each task axis,
however, equation 29a will no longer hold. For example, if a given task has a
limit cycle organization for one task axis, and therefore a nonlinear damping
term, the $B_T$ and hence $B_B$ matrices will reflect only the linear negative
part of this damping. If $B_B$ were used in equation 29a, the articulator
network should be highly unstable. In such cases, however, one might simply
choose to make the articulator network stable about $\theta_0$, given the
$K_A$ specified in equation 29c.
SPECULATIONS ON THE CONTROL OF FUNDAMENTAL FREQUENCY DECLINATION*



Carole E. Gelfer,+ Katherine S. Harris,* Rene Collier, and Thomas Baer



Introduction 

It is generally assumed that, for read speech at least, the fundamental
frequency (F0) of the voice declines over the course of major syntactic
constituents. These units correspond to what has previously been termed the
"breath group" (Lieberman, 1967; Lieberman, Sawashima, Harris, & Gay, 1970)
or "intonation group" (Breckenridge, 1977), being marked on either end by a
pause and/or inspiration. The general downdrift of F0 is exclusive of local
perturbations secondary to syllable prominence and segmental effects, and is
probably best characterized by a steadily declining baseline upon which these
local movements are superimposed (Cohen, Collier, & 't Hart, 1982; Fujisaki &
Hirose, 1982).

Variations in subglottal pressure (Ps) and cricothyroid (CT) muscle
activity are thought to bear most directly on F0 variation, although it has
been difficult to separate the CT's contribution to the global prosodic
structure of an utterance from its involvement in ongoing local adjustments.
However, despite these methodological problems, there has been little evidence
to suggest a gradual decline in CT activity corresponding to that in F0.
Rather, the CT's most active involvement in intonation appears to be confined
to instances of local emphasis (e.g., Collier, 1975; Maeda, 1976). Subglottal
pressure, on the other hand, does exhibit a declination of its own that at
least grossly mirrors the F0 contour (Atkinson, 1973; Collier, 1975;
Lieberman, 1967; Maeda, 1976), thus suggesting that F0 declination might be a
passive phenomenon. However, despite the apparent relationship between Ps and
F0, attempts to establish a direct correlation between the two (Atkinson,
1973; Maeda, 1976) have been unsuccessful in that the drop in F0 exceeds the
3-7 Hz/cm H2O that a purely passive model would predict (Baer, 1979; Hixon,
Klatt, & Mead, 1971; Ladefoged, 1963).
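
The passive prediction amounts to simple arithmetic, sketched below. The
function name and the 2 cm H2O pressure drop are hypothetical and are not
measurements from this study; only the 3-7 Hz/cm H2O sensitivity range is
taken from the text.

```python
def passive_f0_drop(pressure_drop_cm_h2o, sensitivity_hz_per_cm=(3.0, 7.0)):
    """Range of F0 decline (Hz) that a purely passive model would predict
    for a given decline in subglottal pressure (cm H2O)."""
    low, high = sensitivity_hz_per_cm
    return (low * pressure_drop_cm_h2o, high * pressure_drop_cm_h2o)


# Hypothetical utterance in which subglottal pressure falls by 2 cm H2O.
predicted_range = passive_f0_drop(2.0)
```

Under these assumptions, an observed F0 decline larger than the upper end of
`predicted_range` could not be attributed to the pressure decline alone.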

Some researchers have proposed that declination, and the physiological
processes underlying it, is under active speaker control. This assumption
derives in part from observations of variations in some aspects of F0 as a
function of utterance length. Cooper and Sorenson (1981), for example, found
significant, if not robust, increases in initial peak F0 for progressively
longer utterances, while Breckenridge (1977) and Maeda (1976) observed the
total amount of declination to be relatively constant under the same
conditions. However, there is a large amount of data to suggest that final F0
values are invariant despite changes in the length of utterances (Boyce &
Menn, 1979; Cooper & Sorenson, 1981; Kutik, Cooper, & Boyce, 1983; Maeda,
1976), initial starting frequency (Liberman & Pierrehumbert, 1982), or the
insertion of dependent clauses such as parentheticals (Kutik et al., 1983).
What these results seem to suggest, then, is that, as an utterance increases
in length, either the total amount of declination increases or the rate of
decline decreases. However, it is not entirely clear whether 1) these are
mutually exclusive aspects of F0 declination and 2) length-dependent
variations in F0 necessarily refute the predictions of a passive model of
declination and favor theories involving elaborate speaker pre-planning.

The present study examined F0 declination, and some physiological
variables presumed to underlie it, under various linguistic conditions. Our
purpose was to elucidate further the relationships among these variables and
to speculate whether speakers exercise significant control over any or all of
them.

Methods 

The subject was a native speaker of Dutch who produced five repetitions
of Dutch utterances of three lengths: six, thirteen, and twenty syllables.
Mean utterance durations were 1.35, 2.065, and 3.02 seconds, respectively.
All three lengths had the first four syllables in common; for the longer
utterances, the first eight syllables were identical (see Appendix). Each
utterance type was also produced in reiterant form, using either the syllable
/ma/ or /fa/. The purpose of employing reiterant speech was to neutralize
segmental effects while preserving overall intonation and syllable timing
(Larkey, 1983; Liberman & Streeter, 1978). In addition, by using syllables
with expected differences in airflow requirements, the effect of these
differences on subglottal pressure and, possibly, F0, could be assessed.

For each length condition, emphatic stress was placed either on the first
syllable receiving lexical stress (the second syllable in the utterance), the
last syllable receiving lexical stress (the penultimate syllable), or both. We
will refer to these as early, late, and double stress conditions,
respectively. In all, there were twenty-seven utterance types (3 phonetic
conditions x 3 stress conditions x 3 length conditions). All tokens were
aligned to the onset of the second vowel and averaged for each utterance.
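
The line-up and averaging step can be illustrated with a minimal sketch. It
assumes each token is a list of samples and that the line-up sample (the onset
of the second vowel) has already been located by hand or by some other means;
the function name, the window parameters, and the toy data are hypothetical
and are not taken from the study.

```python
from statistics import mean


def align_and_average(tokens, lineup_indices, pre, post):
    """Average repeated tokens after aligning each one to its line-up sample.

    tokens         -- list of sample sequences, one per repetition
    lineup_indices -- index of the line-up event in each token
                      (here: the onset of the second vowel)
    pre, post      -- samples kept before, and from, the line-up point
    """
    windows = []
    for samples, lineup in zip(tokens, lineup_indices):
        start, stop = lineup - pre, lineup + post
        if start < 0 or stop > len(samples):
            raise ValueError("token too short for the requested window")
        windows.append(samples[start:stop])
    # Point-by-point mean across the aligned repetitions.
    return [mean(column) for column in zip(*windows)]


# Toy data: the same event (the value 10) sits at a different offset per token.
tokens = [[0, 1, 10, 3, 2], [5, 0, 1, 10, 3, 2]]
averaged = align_and_average(tokens, lineup_indices=[2, 3], pre=2, post=3)
print(averaged)
```

Because tokens differ slightly in duration, events far from the line-up point
are not perfectly aligned and tend to smear in the average.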

The results were analyzed with respect to the effects of utterance length
and syllable emphasis on initial F0, Ps, CT, and respiratory activity, and the
magnitude and rate of decline in each of these variables over entire
utterances.

Subglottal pressure was recorded by means of a pressure transducer
inserted through the cricothyroid membrane into the trachea. Standard EMG
techniques were used to record from the cricothyroid muscle (Harris, 1981).
Lung volume was inferred from the calibrated sum of thoracic and abdominal
signals from a Respitrace inductive plethysmograph, and F0 was derived from
the output of an accelerometer attached to the pretracheal skin surface. A
cepstral technique was used to extract F0 from the signal.
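
The text does not say which cepstral procedure was used, so the following is
only a generic sketch of cepstral pitch estimation: window a frame, take the
log magnitude spectrum, transform again, and pick the strongest peak in the
quefrency range of plausible pitch periods. The sampling rate, frame length,
search range (60-400 Hz), and the impulse-train test signal are all
assumptions made for illustration.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def cepstral_f0(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate F0 (Hz) of one frame from the peak of its real cepstrum."""
    n = len(frame)
    # Hamming window to reduce spectral leakage.
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    log_mag = [math.log(abs(c) + 1e-12) for c in fft(windowed)]
    # The log magnitude spectrum of a real signal is real and even, so a
    # forward FFT scaled by 1/n equals the inverse transform.
    cepstrum = [c.real / n for c in fft(log_mag)]
    # Search only quefrencies that correspond to fmax..fmin.
    q_lo, q_hi = int(fs / fmax), int(fs / fmin)
    peak = max(range(q_lo, q_hi + 1), key=lambda q: cepstrum[q])
    return fs / peak


# Hypothetical test signal: one pulse every 64 samples at 8 kHz, i.e. 125 Hz.
fs = 8000
frame = [1.0 if i % 64 == 0 else 0.0 for i in range(1024)]
estimate = cepstral_f0(frame, fs)
print(estimate)
```

In practice the estimate would be computed frame by frame along the
accelerometer signal to obtain an F0 contour.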

Results 

Figure 1 shows Respitrace comparisons for each phonetic condition across
stress types for Length 2 utterances. Within utterances of a given phonetic
composition (i.e., Dutch, /ma/, or /fa/), the rate of air expenditure appears
to remain constant within stress condition, as is obvious from the generally
parallel tracings. However, the peak inspiration varies inconsistently across
parallel sets. Thus, it would seem that, on the respiratory level, local
variables such as the degree or place of emphasis were not reflected in the
airflow management of this speaker's utterances.

Across phonemic conditions, however, airflow rates do differ, as is
evidenced by the apparent differences in the rate at which these curves
decline. The left-hand section of Figure 2 shows a comparison of the
Respitrace curves for each phonetic condition for early stress across three
utterance lengths. It appears that airflow rate for the /fa/ condition always
exceeds that for the /ma/ condition, while, for the Dutch, air expenditure is
more variable. The obvious question is whether these differences in airflow
are reflected in the pressure. From the corresponding subglottal pressure
tracings on the right of Figure 2, it can be seen that they are not.
Furthermore, while local segmental effects are apparent in the curves,
particularly for the Dutch utterances, it is also apparent that, for the three
comparisons made at each length, a single line could characterize the decline
of subglottal pressure, despite the variations in phonetic composition and
concomitant airflow characteristics.

Because of the demonstrated uniformity of Ps across phonetic conditions,
the remainder of this paper will focus on the analysis of the reiterant /ma/
utterances on the assumption that they are at least generally representative
of normal speech.



Table 1

Peak inspiration (left) and total inspiration (right), in liters, for the
three length conditions across all stress types.

              Peak Inspiration (liters)        Amount Inspiration (liters)
            Early   Double  Late    Mean      Early   Double  Late    Mean
Length 1    3.83    4.06    4.05    3.98      .85     .93     1.17    .98
Length 2    4.1     4.35    4.12    4.19      .99     1.36    1.11    1.15
Length 3    4.23    4.16    4.09    4.16      1.41    1.61    1.26    1.43
Mean        4.05    4.19    4.09              1.08    1.3     1.18

[Figure 1 panels: lung volume versus time (seconds) for Length 2 utterances;
one panel each for Dutch, /ma/, and /fa/, with early, double, and late stress
tracings.]

Figure 1. Comparison of Respitrace curves for Length 2 utterances across all
stress types shown for each of the three phonetic conditions.



[Figure 2 panels: lung volume (left) and subglottal pressure (right) versus
time (seconds), early stress.]

Figure 2. Corresponding Respitrace (left) and subglottal pressure (right)
curves for the early stress condition across the three phonetic
conditions. Comparisons are shown for each utterance length.

Figure 3 again shows Respitrace curves for each stress type across the
three utterance lengths. It can be seen that there are no visually significant
differences in the rate of expiration nor evidence of systematic adjustments
in peak inspiration as a function of anticipated length. It is the case,
though, that the depth of inspiration (with the exception of one utterance
type) appears to be adjusted according to utterance length. This is evident
from the values in Table 1, which shows both the point of peak inspiration and
the amount of inspiration (calculated by subtracting the preceding valley from
the peak) for each utterance. However, because the experiment was designed in
such a way that all tokens of a particular stress type were produced in blocks
of utterances of increasing length, it is impossible to determine the
significance of this finding. In other words, because Length 3 tokens were
always preceded by tokens of the same length or, in one instance, by the last
token of the shorter length utterance, inspiration necessarily began at a
point lower in this speaker's vital capacity than, for example, Length 2
tokens, which could have only been preceded by tokens of the same length or
shorter. Thus, we are unable to determine whether the increase in the depth of
inspiration as a function of length represents an artifact of experimental
design or evidence of anticipated pulmonary requirements. Overall, the
Respitrace data fail to demonstrate conclusively the manner or extent to which
this speaker makes prephonatory adjustments of this kind under the various
conditions. However, in light of the otherwise uniform nature of these
Respitrace curves, and the absence of any obvious relationship with the
subglottal pressure, their influence on the ultimate trajectory of fundamental
frequency declination appears questionable.

Figure 4 depicts the F0 contours for the three stress conditions for each
utterance length. These contours probably represent what has been termed
"baseline declination" in as pure a form as possible in that significant
segmental effects are absent. For the early and double stress conditions,
there is an obvious peak associated with every emphatic syllable, and a
consistent initial peak height difference as a function of utterance length.
However, F0 does not decline steadily from these peaks. Rather, there is a
rapid drop in frequency to a point from which F0 then begins a steady decline.
While the time course of this initial plunge is constant across lengths,
despite differences in peak height, the points from which the slow decline
begins for each length are not, bearing instead the same relationship as the
initial peaks. This relationship appears to be maintained throughout the
course of at least the longer utterances, although they appear to decline in
parallel. In the absence of early emphasis in the late stress condition, the
F0 peaks occur upon initiation of the utterance and are thus displaced in time
relative to the second syllable peaks in the former two conditions.
Furthermore, the decline of F0 from these peaks is far more gradual and less
strikingly parallel. However, it is of some interest to note that the
relationship of these nonemphatic initial peaks across lengths is the same as
for their emphatic counterparts.

Figure 5 shows the corresponding subglottal pressure tracings. It can be
seen that the same general tendencies prevail. That is, there is an effect of
utterance length on the initial peak pressure and a relatively rapid initial
pressure drop into a more-or-less parallel and steadily declining function for
the longer utterances. Again, the peaks occur earlier in the late stress
utterances and the initial pressure drop is less rapid.

If the F0 and Ps tracings are examined in parallel, it becomes apparent
that there is a point in time, following the initial peaks, after which the
decline in F0 almost mirrors that of Ps. However, the parallelism is less
obvious for the earlier portions of early and double stress utterances, since
the most rapid drop in Ps is far more gradual than that for F0.

Figure 3. Respitrace curves for reiterant /ma/ utterances across lengths.
Comparisons are shown for each stress condition.

Figure 4. Fundamental frequency curves for reiterant /ma/ utterances across
lengths. Comparisons are shown for each stress condition.

Figure 5. Subglottal pressure curves for reiterant /ma/ utterances across
lengths. Comparisons are shown for each stress condition.

Figure 6. Averaged cricothyroid muscle activity for reiterant /ma/ utterances
across lengths for each stress condition. Final peaks for each
utterance length are denoted by the numbers above these peaks.

Figure 6 shows CT activity across lengths for the three stress
conditions. The overall CT pattern differs from those of Ps and F0 in that,
although inherently noisy, CT activity appears to be relatively binary in
utterances of this form. There are significant increases in CT occurring for
initial syllables, whether stressed or not, and for all final stressed
syllables, which in this figure are marked in the double and late stress
conditions according to their respective lengths. CT activity during the early
portion of these utterances is characterized by double peaks whose timing is
identical across stress types, but whose relative magnitudes differ with
stress type, corresponding to the placement of the Ps and F0 peaks. The double
peaks associated with the final stressed syllables of Lengths 2 and 3,
however, are the result of averaging events that are distant from the line-up
point in tokens of slightly unequal lengths, and are not characteristic of CT
activity for final stress peaks.

In order to examine the effect of anticipated length on the initial
portions of utterances, we compared initial peak values of CT, Ps, and F0 for
each stress type across lengths. It should be recalled that the initial
utterance peaks for Ps and F0 in the late stress condition were displaced
relative to those with early and double stress, while the timing of the CT
peaks remained constant irrespective of stress type. In the interest of
consistency, then, the values reported here for Ps and F0 in the late stress
condition are those that correspond in time to the peaks for the other two
stress conditions and, thus, actually represent values on the declining
portion of these curves. The results are shown in Table 2. It can be seen that
a consistent effect of sentence length obtains for every stress condition for
all physiological and acoustic measures.

If the corresponding F0 and CT curves are examined in parallel, there
appears to be a close correspondence between the time course of the CT
suppression and the point at which F0 begins its steadiest decline. We would
thus hypothesize that the combined activity of CT and Ps accounts for the
behavior of F0 near peaks, but not during the period of F0 slow decline. We
acknowledge, of course, that the activity of a number of muscles, not
monitored in this study, may also have causal effects on F0.

Assuming, then, that CT plays little or no active role in F0 declination,
we examined the relationship between Ps and F0 in two different ways. First,
the amount of drop in F0 and Ps was calculated between the point at which the
CT activity ceased and the end of the utterance in the early stress condition,
and between CT cessation and the minimum values just preceding the last peak
in the double and late stress conditions. In the second analysis, we used the
average duration of Length 1 of the early stress utterances as a fixed
endpoint and determined the amount and rate of F0 and Ps decline between the
offset of CT activity and this fixed endpoint for all utterances.

The offset of CT activity was defined as the time at which the EMG output
(measured in microvolts) dropped to and remained below a level equivalent to
the baseline plus 10% of the peak level. These analyses were not performed on
Length 1 of the double and late stress conditions. In the former condition,
the interval between CT offset for the first peak and CT onset for the second
was too short. In the latter condition, CT activity was never consistently
suppressed, so that an offset time could not be obtained. Furthermore, in
both cases the designated interval for the second analysis extended into the
final stress peak. These analyses were performed on a token-by-token basis in
order to accommodate variability in the timing of CT activity.

Table 2

Initial peak measurements of cricothyroid activity, subglottal pressure, and
fundamental frequency for the three length conditions across stress types. Ps
and F0 values for the late stress condition do not represent absolute peak
values (see text).

                                  Initial Peak Values
                              Early   Double  Late    Mean
Cricothyroid      Length 1    202     273     159     211
                  Length 2    296     277     169     247
                  Length 3    310     331     189     277
                  Mean        269     294     172

Subglottal        Length 1    8.3     9.9     7.1     8.4
Pressure          Length 2    9.7     10.7    7.4     9.3
(cm-H2O)          Length 3    9.9     11.3    8.2     9.8
                  Mean        9.3     10.6    7.6

Fundamental       Length 1    135     137     102     125
Frequency         Length 2    141     143     111     132
(Hz)              Length 3    166     158     124     149
                  Mean        147     146     112

Table 3

Analyses of rate of decline in F0 and Ps across lengths for each stress
condition, calculated for (1) the interval from the point of CT offset to Ps
minima (variable interval) and (2) the interval from the point of CT offset to
a fixed endpoint corresponding to the average duration of Length 1 of the
Early stress condition (constant interval). The frequency-to-pressure ratios
are also shown for each analysis.

                          ANALYSIS 1                  ANALYSIS 2
                      F0      Ps      F0/Ps       F0      Ps      F0/Ps
Early     Length 1    22.21   3.91    5.61        22.21   3.91    5.61
          Length 2    11.39   2.17    5.83        19.7    3.73    5.28
          Length 3    7.03    1.07    6.57        17.12   3.95    5.2

Double    Length 1    --      --      --          --      --      --
          Length 2    15.37   2.38    6.46        22.52   1.12    5.17
          Length 3    10.76   1.36    7.91        19.75   3.19    5.66

Late      Length 1    --      --      --          --      --      --
          Length 2    20.79   2.57    8.09        17.32   2.61    6.61
          Length 3    16.85   1.56    10.8        35.22   3.52    10.01

Table 3 shows the results of both analyses in terms of F0 slope (Hz/sec),
Ps slope (cm-H2O/sec), and the frequency-to-pressure ratio (Hz/cm-H2O).
Looking first at the ratios from both analyses, it should be noted that six of
the seven values from Analysis 2 fall within the acceptable range of 3-7
Hz/cm-H2O, while only four of the seven values from Analysis 1 fall within
this range. However, even those values that fall outside the range are
considerably lower than those reported when the effects of CT and possibly
other muscle activity are not neutralized (see Maeda, 1976). Thus, a passive
mechanism whereby F0 declination is determined by a steadily falling
subglottal pressure should be reconsidered.
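
The CT-offset criterion and the slope and ratio measures can be sketched as
follows. This is an illustration under stated assumptions, not the authors'
procedure: the threshold is a literal reading of "baseline plus 10% of the
peak level", the rate of decline is taken as the drop between the two interval
endpoints divided by the elapsed time, and the sample token is synthetic.

```python
def ct_offset_index(emg, baseline):
    """Index of the first sample from which EMG stays below the threshold.

    Threshold = baseline + 10% of the peak EMG level (assumed reading).
    Returns None if the activity is never consistently suppressed.
    """
    threshold = baseline + 0.10 * max(emg)
    last_above = max((i for i, v in enumerate(emg) if v >= threshold), default=-1)
    offset = last_above + 1
    return offset if offset < len(emg) else None


def decline_rate(times, values, start, end):
    """Drop per second between sample indices start and end (inclusive)."""
    return (values[start] - values[end]) / (times[end] - times[start])


# Synthetic token sampled every 0.1 s; every value here is made up.
times = [0.1 * i for i in range(9)]
emg = [5, 40, 105, 60, 14, 12, 6, 5, 5]       # microvolts
f0 = [140 - 20 * t for t in times]            # Hz
ps = [9 - 4 * t for t in times]               # cm-H2O

start = ct_offset_index(emg, baseline=5)
end = len(times) - 1
f0_slope = decline_rate(times, f0, start, end)    # Hz/sec
ps_slope = decline_rate(times, ps, start, end)    # cm-H2O/sec
ratio = f0_slope / ps_slope                       # Hz/cm-H2O
within_passive_range = 3 <= ratio <= 7
print(start, round(f0_slope, 2), round(ps_slope, 2), round(ratio, 2),
      within_passive_range)
```

Using a fixed value of `end` for every utterance gives the constant-interval
variant (Analysis 2); letting `end` follow the Ps minimum or the end of the
utterance gives the variable-interval variant (Analysis 1).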

Discussion 

As for the influence of utterance length on the slope of F0 and Ps
change, the results of Analysis 1 show a substantial decrease in the rate of
change with increasing length. This effect has been observed in previous
studies and assumed to represent high-level preplanning whereby certain
physical aspects are represented in a speaker's utterance plan. However, when
slope is calculated over fixed portions of these same utterances, as in
Analysis 2, the length effect observed in Analysis 1 is substantially
lessened, demonstrating a more constant rate of decline across lengths. (For
Length 3 of the late stress condition, there is probably some peculiarity in
the data, particularly for F0.) The results of the latter analysis further
suggest that neither F0 nor Ps declines at a constant rate across an entire
utterance. If they did, we would expect the slopes to be identical over any
portion of a given utterance, despite its length. However, the results of
Analysis 1 demonstrate that this is not the case. It appears that, with the
obvious exception of the late stress utterances, the rate of decline in Ps and
F0 is greatest earlier in an utterance, as is indicated by the steeper slopes
in the second analysis, and that these curves would be best characterized by
an exponential function. Thus, the apparent "length effect" that we and others
observe when slope is calculated over an entire utterance can probably be
attributed to the nonlinear nature of F0 declination and not to elaborate
precalculations or ongoing reorganization on the basis of utterance length.
Our data substantiate the claims of Liberman and Pierrehumbert (1982) that the
F0 contour gradually approaches an asymptotic value, as well as Maeda's
finding that the latter portion of some utterances may not show any evidence
of declination.
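
The argument that a nonlinear contour by itself produces an apparent length
effect can be checked with a small numerical sketch. It assumes a single
exponential decay toward an asymptote that is shared by all utterance lengths;
the starting frequency, asymptote, time constant, and onset time are
hypothetical values chosen for illustration, and only the three durations come
from the Methods.

```python
import math


def f0_contour(t, start_hz=140.0, asymptote_hz=100.0, tau=1.0):
    """Exponential decay toward an asymptote (parameter values hypothetical)."""
    return asymptote_hz + (start_hz - asymptote_hz) * math.exp(-t / tau)


def mean_slope(t_from, t_to):
    """Average rate of F0 change (Hz/sec) between two time points."""
    return (f0_contour(t_to) - f0_contour(t_from)) / (t_to - t_from)


onset = 0.3                            # hypothetical CT offset time (sec)
durations = [1.35, 2.065, 3.02]        # mean utterance durations (sec)

# Analysis 1 analogue: slope from onset to the end of each utterance.
variable_interval = [mean_slope(onset, d) for d in durations]
# Analysis 2 analogue: slope from onset to a fixed endpoint (Length 1).
constant_interval = [mean_slope(onset, durations[0]) for _ in durations]

for d, v, c in zip(durations, variable_interval, constant_interval):
    print(f"{d:5.3f} s  variable: {v:7.2f} Hz/sec  constant: {c:7.2f} Hz/sec")
```

With one and the same contour, the whole-utterance slope becomes shallower as
duration grows, while the fixed-interval slope does not change. The sketch
deliberately ignores the length-dependent peak heights reported in Table 2.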

The systematic adjustments in peak F0 suggest that, on some level, this
speaker does take sentence length into account. However, these peaks do not
appear to influence the trajectory of the total declination contour. Rather,
their influence appears to be limited to their immediate vicinity, probably
including the frequency from which declination actually begins. However, the
latter is probably a function of temporal constraints whereby, in a fixed
amount of time, the frequency to which F0 falls is a function of the frequency
from which it starts. Thus, whatever its purpose, manipulating peak height
does not appear to be essential to the realization of declination, per se.

In summary, we have found that, for reiterant utterances composed of
voiced continuants, where normal segmental adjustments were presumed to be
neutralized, CT activity was prominent in instances of emphatic syllable
stress, and relatively inactive elsewhere. Subglottal pressure, on the other
hand, showed a gradual decline before and/or after stress peaks and was
paralleled by a falling fundamental frequency. Thus, while we cannot rule out
effects such as vocal fold relaxation on F0 and Ps during these intervals, the
data do suggest that, where CT activity is negligible, F0 declination can be
accounted for on the basis of a falling Ps alone.

Our conclusions at this point must be tentative for two reasons: first,
because we have analyzed the data of only one subject and, second, because
there are inconsistencies between the late stress utterances and the other two
stress conditions. However, we believe there are strong indications that
declination may be the province of low-level processes such that variations in
certain aspects of F0 are the result, not of high-level (i.e., cognitively
generated) planning processes, but of the intrinsic behavioral properties
of underlying physiological systems.

Footnotes

1. This should not be interpreted as meaning that there is not a gradual
relaxation of the muscle, but that the EMG activity associated with
cricothyroid contraction does not appear to lessen gradually over the course
of an utterance.

2. The durations of the reiterant utterances are, on average, somewhat
longer than the corresponding Dutch due to the intrinsic duration of /a/
and/or the inadvertent addition of extra syllables. Those tokens for which
the latter was evident were still included in all analyses on the assumption
that the speaker's intention was to produce an utterance of a given duration,
so that any length-dependent adjustments would be identical.

3. As used here, peak inspiration corresponds to the maximum amplitude of
the output signal of the Respitrace immediately preceding the onset of speech.
Because of the built-in filter characteristics of the Respitrace, however,
this point may not represent actual peak inspiration. Furthermore, baseline
drift and positional changes may introduce artifact into the signal as well.
Therefore, peak inspiration measures should be interpreted with caution.

4. The actual peak values for Ps and F0 for the late stress condition
evidence a similar length effect. They are, in order of increasing length:

Ps: 7.9, 8.4, 9.0; F0: 113, 119, 139.

APPENDIX

Early Stress

Length 1: Je weet dat Jan nadenkt.
Length 2: Je weet dat Jan erover nadenkt te betalen.
Length 3: Je weet dat Jan erover nadenkt ons daarvoor met genoegen te
          betalen.

Double Stress

Length 1: Je weet dat Jan nadenkt.
Length 2: Je weet dat Jan erover nadenkt te betalen.
Length 3: Je weet dat Jan erover nadenkt ons daarvoor met genoegen te
          betalen.

Late Stress

Length 1: Je weet dat Jan nadenkt.
Length 2: Je weet dat Jan erover nadenkt te betalen.
Length 3: Je weet dat Jan erover nadenkt ons daarvoor met genoegen te
          betalen.

SELECTIVE EFFECTS OF MASKING ON SPEECH AND NONSPEECH IN THE DUPLEX PERCEPTION
PARADIGM



Shlomo Bentin+ and Virginia A. Mann++



Abstract. Perception of second formant transitions isolated from a
synthetic [ba] or [ga] syllable as nonspeech "chirps" or as supporting
identification of a stop consonant was investigated using the duplex
perception phenomenon. This phenomenon arises when dichotic presentation of
the transition and the remaining part of a CV syllable (the base) allows the
transition to support both perception of that syllable and a nonspeech chirp
simultaneously. Over the course of four experiments it was found that:
1) Stimulus onset asynchrony of base and transition impairs the accuracy of
syllable labeling, but it improves "chirp" classification into nonspeech
categories, whereas a white noise backward mask ipsilateral with the
transition impairs categorization of "chirps" but not of the syllables
supported by these transitions. 2) Progressive attenuation of the relative
intensity of the transitions impairs speech perception at a slower rate than
nonspeech perception. 3) A white noise mask preceding the base in the
ipsilateral ear and presented simultaneously with the transition interferes
with labeling syllables, but does not affect categorization of chirps. 4) A
white noise ipsilateral backward mask of the transition penalizes both
categorization and discrimination of nonspeech percepts more extensively than
that of speech. An analogous mask consisting of a second formant transition
appropriate to [da] did not affect nonspeech perception, but impaired correct
labeling of the syllables. It is suggested that perception in the speech and
in the nonspeech modes is contingent upon activation of different central
mechanisms.

Speech perception involves the recovery of phonetic information embedded
in acoustic patterns that stimulate the auditory nervous system.
Frequency-modulated acoustical signals, formant transitions, can be sufficient
cues for the perceived distinction between stop consonants when they are
integrated with a syllabic base, in which case they support an abstract
phonetic percept such as "b" or "g." However, when the same formant
transitions are presented in isolation, perception reflects the acoustic
characteristics of the time-varying nature of the frequency-modulated signal.
Thus, although some subjects might be able to perceive isolated formant
transitions as speech-like (Nusbaum, Schwab, & Sawusch, 1983), most tend to
describe them as "chirps" that have no relation to the perceptual
characteristics of the stop consonants that they otherwise may cue (Mattingly,
Liberman, Syrdal, & Halwes, 1971). This radically different perception of the
same physical stimulus in two different acoustic contexts has been taken to
reflect two different modes of auditory perception in humans: a phonetic mode
exclusively dedicated to speech, and a nonphonetic mode for the perception of
other auditory stimuli (Liberman, 1982; Liberman & Studdert-Kennedy, 1978;
Mann & Liberman, 1983; Repp, 1982).

The two different perceptual modes are simultaneously operative in the
phenomenon of "duplex" perception first described by Rand (1974). Duplex
perception occurs when the transition of the second and/or third formant,
which supports perception of a unique stop consonant, is separated from the
rest of a synthetic syllable and presented to one ear, while the remaining
acoustic pattern (the base, which by itself is perceived as a syllable) is
presented to the contralateral ear. In this situation, most (but not all)
listeners simultaneously experience two distinct percepts: One is the original
syllable that would result if the base and the transition were electronically
fused; the other is the "chirp" sound produced by the frequency modulation of
the isolated transition. Interestingly, if fusion occurs (as it does for the
large majority of listeners), the subject hears both the fused speech percept
and the nonspeech characteristics of the transition, but not the base by
itself. Since the duplicity of perception involves the fused percept and only
one aspect of the "prefused" information, the duplex phenomenon does not
represent merely a lower vs. higher hierarchical level of information
processing, but represents two modes of processing, phonetic and nonphonetic
(Liberman, 1982).

In addition to being an interesting experimental demonstration of the two
perceptual modes, the duplex phenomenon provides an excellent opportunity to
investigate at what level of information processing they become distinct and
to understand better the difference between speech and nonspeech perception.
An advantage in studying this phenomenon is that the two percepts arise from
one and the same physical stimulus, so that all that needs to be manipulated
are the instructions to the listener. It has been shown, for example, that
presence vs. absence of a preceding interval of silence affects
discriminability of formant transitions when they support perception of the
consonants "t" and "p" following "s," but has no effect on the
discriminability of the same transitions as nonspeech chirps (Liberman,
Isenberg, & Rakerd, 1981). This was taken as evidence that the importance of
silence in the perception of stop consonants is related to "specifically
phonetic (as distinguished from general auditory) processes, and that the
effect of silence in such cases is an instance of perception in a
distinctively phonetic mode" (Liberman et al., 1981, p. 142). Further studies
have shown that, on the speech side of the duplex percept, transitions are
perceived categorically, whereas, in contrast, the same transitions heard as
"chirps" are discriminated continuously according to onset frequency.
Moreover, preposed syllables affect the perception of the transitions when
they support the perception of stop consonants on the speech side of the
duplex, but not their categorization as nonspeech chirps (Mann & Liberman,
1983).

Studies of duplex perception conducted so far have been aimed primarily
at supporting the distinction between phonetic and nonphonetic modes of
perception. This aim has been accomplished successfully insofar as it has
been possible to manipulate variables that influenced the phonetic side of the
duplex percept, but had no effect on the nonphonetic perception of the
isolated transitions. Yet, if the nonphonetic mode is truly independent, one
should likewise be able to manipulate variables that will selectively
influence the perception of the transitions as "chirps," while leaving the
speech percept unaffected. A careful investigation of variables that
selectively affect each side of the duplex percept is necessary if we are to
accept the independence of the two perceptual modes.

To this end, the present study investigated the relation between the
phonetic and nonphonetic modes of perception by manipulating factors that have
selective effects on one or the other side of the duplex percept. Four
experiments are reported. The first examined the effect of the stimulus onset
asynchrony (SOA) between the base and the transition, and the effect of white
noise auditory backward masking of the transition on labeling each component
of the duplex percept. The second experiment examined the effect of
attenuating the amplitude of the isolated second formant (F2) transition on
labeling each component of the duplex percept. The third experiment presented
the white noise mask of Experiment 1 on the same channel as the base and
investigated the effects of this masking condition on speech and nonspeech
perception of the transition. The fourth and final experiment examined
separately the effects of backward masking of formant transitions by white
noise and by different formant transitions on the labeling and discrimination
of each component of the duplex percept.



Experiment 1

This experiment investigated the effects of increasing SOA on each
component of the duplex percept. Specifically, we reasoned that categorization
of chirps might be facilitated by an increase in SOA, whereas speech
perception is penalized. Cutting (1976) has already reported that when the
transition and the base of a CV syllable, [ba], [da], or [ga], are
dichotically presented, SOA has a destructive effect on the fusion of the two
stimuli. In contrast to this general effect, it should be mentioned that in
Cutting's (1976) study the subjects were sometimes able to fuse the
transitions and the base even at large SOAs of 80 ms and more, as suggested by
their above-chance correct labeling of the syllables as [ba], [da], or [ga].
However, one problem with this interpretation of their responses is that the
unfused base is frequently perceived as [da], and Cutting's inclusion of this
category as a correct response confounded possible nonfusion with successful
fusion. We attempted, therefore, as part of the present experiment, to
replicate Cutting's results, but to circumvent the problem posed by the
ambiguity of [da] percepts in his experiment.

Another goal of the first experiment was to investigate the effect that
monotic white noise backward masking of the isolated transitions might have on
the speech and the nonspeech aspects of the duplex percept. The assumption
underlying the use of backward masking to study perceptual processing is that
when the onset of the mask is delayed relative to the onset of the target,
processing of the target occurs during the delay, but is interrupted by the
arrival of the mask (Turvey, 1973). If speech and nonspeech perception
represent distinct modes, we might be able to mask selectively one or the
other of the two perceptual aspects of the F2 transition. We would then be
able to investigate the level in the auditory system at which the two modes
become separated.

The decision to employ a white noise mask was based on several
considerations. We wanted a mask that would be most likely to have a selective
effect on the nonspeech aspect of the transition. Massaro (1970) has shown
that a tone is masked efficiently by a different tone even when the similarity
between the pitch of the mask and the target tones is varied over a
considerable range. On the other hand, white noise does not mask
categorization of one or two tones when the intensity of the mask does not
exceed that of the test tones (Kallman & Brown, 1983), although it does
effectively mask detection of tones (Elliot, 1967). No similar results are
available for the masking of CV syllables. However, we assumed that the
general unpatterned nature of the white noise mask would minimize the
possibility that it interferes with extraction of the phonetic information
embedded in the transition, while still interfering with perception of its
nonspeech aspects.

The first experiment, therefore, had two goals: (a) to investigate the
effects of increasing SOA between the F2 transition and the base, and (b) to
investigate the effects of a white noise backward mask presented in the same
channel as the transition on subjects' ability to label the speech and
nonspeech aspects of the duplex percept.



Method

Stimuli. The stimuli used to create the duplex percepts are schematically
represented in Figure 1. They were adapted from two-formant synthetic
approximations to the syllables [ba] and [ga], as produced on the parallel
resonance synthesizer at Haskins Laboratories. The pattern illustrated in the
left panel of Figure 1, which we refer to as the "base," is the constant
portion of the two syllables. Its duration is 300 ms, with a 25 ms amplitude
ramp at onset, a 100 ms amplitude ramp at offset, and a fundamental frequency
that falls linearly from 11^ to 79 Hz. The first formant begins at 100 Hz,
and during the first 50 ms it increases linearly to achieve a steady-state
frequency of 765 Hz. The remaining two patterns, illustrated in the
right-hand panel of Figure 1, are the F2 transitions appropriate for [ba] and
[ga]. Each was synthesized separately from the base, and is 50 ms in
duration. Their common offset frequency is the steady-state frequency of the
F2 of the base (1230 Hz), and their amplitude contour and fundamental
frequency are identical to those of the first 50 ms of the base. The [ba]
transition starts at 92^ Hz, has a rising frequency contour, and if
electronically fused with the base, supports perception of [ba]. The [ga]
transition starts at 2298 Hz, has a falling frequency contour, and if
electronically fused with the base, supports perception of [ga]. The base
alone tends to be perceived as a poor quality [da].

An additional stimulus was created for the purpose of backward masking
the perception of the F2 transitions. It consisted of 15 ms of white noise at
an intensity 1.8 dB above the maximal intensity of the transitions.

Test tapes. The base, F2 transitions, and white noise mask were digitized
at 10 kHz, and subsequently recorded onto magnetic tape. Five stimulus
series were created: three practice series to acquaint subjects with the
duplex percept, and two test series to assess the influence of temporal
asynchrony and of the white noise mask on the speech and nonspeech components
of the duplex percept.

Figure 1. Schematic representation of the patterns used to produce the duplex
percepts. [Left panel: base (to one ear). Right panel: isolated [ba] and [ga]
transitions (to other ear). Axes: frequency in kHz, time in msec.]

In the first practice series, designed to familiarize subjects with the
speech component of the duplex percept, the base was electronically fused with
each of the F2 transitions so as to form two syllables, [ba] and [ga]. These
were recorded five times each, and then ten times in alternation. In the
second practice series, designed to familiarize the subject with the chirp
component of the duplex percept, the isolated [ba] and [ga] transitions were
recorded five times each and then ten times in alternation. The third and
final practice series was designed to familiarize the subjects with duplex
percepts. In it, the base and the F2 transitions were recorded onto separate
channels of the tape so as to permit dichotic presentation of each transition
in synchrony with the base. These two duplex stimuli were also recorded five
times, then alternated ten times.

The two test series included only dichotic stimuli. As in the third
practice series, the base and the transitions were recorded onto separate
channels, but the synchrony of base and transition was systematically
manipulated. In the first test series, the [ba] and [ga] transitions each
preceded the base eight times at eight different SOAs: 0, 20, 40, 60, 70, 80,
90, and 100 ms. This yielded a total of 128 stimuli that were recorded in
randomized sequence with interstimulus intervals of 2.5 sec, and longer pauses
between blocks of 16 stimuli. In the second test series, synchrony of base
and transition was again manipulated, but each transition was also immediately
followed by the white noise masking stimulus. Each transition preceded the
base eight times at three different intervals: 0, 20, and 40 ms. This yielded
a total of 48 stimuli, recorded in randomized series with the same
interstimulus and interblock intervals as in the first test series.
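The composition of the two test series can be checked with a short sketch.
It only illustrates the counting (2 transitions x 8 repetitions x number of
SOAs) and the division into blocks of 16; the function names, the dictionary
layout, and the random seed are hypothetical, and the third masked SOA is
taken to be 40 ms, which is a reading of a damaged scan rather than a
confirmed value. The original tapes were of course not generated this way.

```python
import random
from itertools import product

TRANSITIONS = ("ba", "ga")
REPETITIONS = 8
SOAS_UNMASKED_MS = (0, 20, 40, 60, 70, 80, 90, 100)  # first test series
SOAS_MASKED_MS = (0, 20, 40)  # second test series; 40 ms is an assumed reading


def build_series(soas_ms, masked, seed=0):
    """Return a randomized trial list: every transition at every SOA, 8 times."""
    trials = [
        {"transition": transition, "soa_ms": soa, "masked": masked}
        for transition, soa, _ in product(TRANSITIONS, soas_ms, range(REPETITIONS))
    ]
    random.Random(seed).shuffle(trials)  # seed is arbitrary, for repeatability
    return trials


def blocks(trials, size=16):
    """Split a trial list into consecutive blocks of `size` stimuli."""
    return [trials[i:i + size] for i in range(0, len(trials), size)]


series_1 = build_series(SOAS_UNMASKED_MS, masked=False)
series_2 = build_series(SOAS_MASKED_MS, masked=True)
```

Under these assumptions the first series contains 128 trials in 8 blocks and
the second contains 48 trials in 3 blocks, matching the totals given in the
text.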

Procedure 

Subjects listened to stimuli over TDH-39 headphones in a quiet room.
They were naive as to the nature of the experiment, being told only that its
purpose was to examine whether perception of speech and nonspeech could be
altered by certain distractor sounds. They were further advised to attend to
the percept designated by the experimenter and to ignore all else. The
experiment began with a pretest that involved presentation of the three
practice series: the electronically fused syllables, the isolated transitions,
and the dichotic stimuli. Participation in the experiment proper required
that a subject be able to distinguish accurately the speech and nonspeech
percepts presented in the first two practice series, and to label accurately
the two components of each duplex percept in the third practice series. In
this manner we ensured that each subject who continued in the experiment was
able to perceive the distinction between the [ba] and [ga] speech percepts and
the distinction between the rising and the falling chirps, which were the
nonspeech percepts in the duplex listening condition.



Those subjects who met the pretest requirements went on to participate in
two experimental sessions, with order counterbalanced across subjects. In one
session, their task was to label the speech percepts as containing "b" or "g,"
while ignoring the nonspeech percepts. That session began with presentation
of the first (electronically fused syllables) and third (duplex percepts)
practice series, followed by the two test series in counterbalanced order. In
the other session, the task was to label the nonspeech "chirp" percept as
rising or falling, while ignoring the speech percepts. In this case
presentation of the second (isolated transitions) and the third (duplex
percepts) practice series preceded the two test series.

Subjects. The subjects who met the requirements of the experiment were
ten young women who attended Bryn Mawr College. Two additional subjects were
screened but not included in the subject population because they failed to
distinguish the [ba] and [ga] components of the duplex percept.



Results 

The data for the first experiment comprise labeling responses to the two
components of the duplex percept, the speech percept of [ba] or [ga] and the
nonspeech percept of a rising or falling chirp. In the first test series we
had systematically manipulated the synchrony of the constant base and the
variable F2 transition heard in the other ear. The percent correct responses
for the speech and nonspeech percepts averaged across the ten subjects appear
in Figure 2 as a function of SOA.



In general, subjects were more accurate in labeling nonspeech percepts of
rising and falling chirps than in labeling speech percepts of [ba] and [ga].
This difference was significant in a repeated measures ANOVA, F(1,9) = 12.39,
MSe = 780, p < .005. The systematic increases in SOA also had an effect on
response accuracy, F(7,63) = 3.39, MSe = 65, p < .004, but most importantly,
manipulations of temporal asynchrony had opposite effects on speech and
nonspeech percepts, F(7,63) = 8.75, MSe = 73, p < .0001. Whereas speech
perception was best when the base and transitions were in time synchrony,
nonspeech perception was better when the transitions preceded the base by
20 ms or more.

Figure 2. The effects of increasing SOA on labeling syllables and "chirp"
categorization.

Figure 3. The effects of backward masking the transition by white noise on
labeling syllables and "chirp" categorization.

We turn now to the results obtained with the second test series, designed
to evaluate the effects of backward masking on each component of the duplex
percept. The results are presented in Figure 3 where, for convenience, we
have also given the results obtained with the comparable unmasked stimuli from
the first test series. An analysis of variance computed on the results
summarized in Figure 3 revealed that while the white noise stimulus had some
effect on performance, F(1,9) = 7.59, MSe = 269, p < .02, the more interesting
result is that the mask penalized perception of the transitions as rising or
falling chirps, but had no debilitating effect on the perception of the
transitions as cues for distinguishing [ba] and [ga], F(1,9) = 53.^7,
MSe = 72, p < .0001. Also, no interaction was found between the SOA and the
masking effects. The contrasting influences of backward masking with white
noise and manipulation of temporal synchrony on the two components of the
duplex percept are to be regarded as the major outcome of this experiment.

Discussion 

This experiment has shown that SOA and a white noise mask had selective
(but opposed) effects on the speech and nonspeech aspects of the duplex
percept. While an increase in the SOA between the base and the transition
systematically degraded speech identification when the transition was
incorporated into a CV percept, SOAs beyond 20 ms had a facilitatory effect on
the labeling of the same transition as a nonspeech chirp. In contrast, a
white noise mask that immediately followed the isolated transition had no
effect on speech identification, but considerably impaired the
discriminability of the chirps.

The explanation of the differential effect of increasing SOA on speech
and chirp discrimination is straightforward. SOA has an adverse influence on
fusion, and emphasizes the individuality of each channel. Obviously, since
chirp identification is based on one channel only, it could not be penalized
by this manipulation. Moreover, the higher percent correct categorization of
chirps with nonzero SOA might suggest release from a masking effect of the
(lower frequency) first formant transition, which might survive in spite of
the dichotic presentation. Since identification of [ba] or [ga] is based on
fusion of the base and the transition, obviously SOA impaired perception of
the syllables. We note, however, that correct labeling of [ba] and [ga] was
above chance even when the SOA between the transition and the base was as long
as 100 ms. Since the base was identical for both the [ba] and [ga] duplex
percepts, any correct identification of the syllable is contingent upon the
phonetic information provided by the transition. Nusbaum et al. (1983) claim
that listeners may identify the isolated transitions as speech without
integrating them with the base. If indeed labeling of syllables at large SOAs
was based on phonetic categorization of the transitions, an above-chance
asymptote in performance should have emerged. In contrast, a continuous
decline of percent correct was found as SOA was increased, supporting the
hypothesis that syllable identification in the duplex situation is in fact
based on successful dichotic fusion (Repp, in press; Repp, Milburn, &
Ashkenas, 1983). We therefore reject Nusbaum et al.'s hypothesis, and assume
that some fusion did occur even when the offset of the transition preceded the
onset of the base by 50 ms. Successful fusion implies that, at some point in
time, the two stimuli were simultaneously available to the phonetic processor
and suggests that the information provided by the leading transitions was
somehow stored in memory. The location and form of this storage is not
revealed by the present experiment.

The second and, perhaps, more important result is the differential effect
of the white noise mask on the speech and nonspeech aspects of the perception
of the transitions. The white noise mask impaired categorization of the
nonphonetic differences between the two transitions, but the different
phonetic percepts supported by them remained unaffected. This result might be
explained in two ways. The first is to assume that labeling in the phonetic
mode, being both natural and based on highly overlearned categories, requires
less precise auditory information than does labeling in the nonspeech mode,
where the artificiality of the task and the higher amount of uncertainty
require more information to be resolved. If this were so, the mask would have
had a general effect on the input of the auditory information, interfering
with a common sensory, precategorical storage mechanism accessed by both the
phonetic and nonphonetic processing systems. However, other alternatives
should also be considered. A second possible explanation is that there exist
different phonetic and nonphonetic information processing mechanisms (Cutting
& Pisoni, 1978). If this were so, the unpatterned white noise would have
interfered selectively with nonphonetic processing. One way to discriminate
between the two explanations is to determine whether other stimulus
degradations penalize nonspeech to a greater extent than speech perception.
This was tested in Experiment 2.



Experiment 2 

One explanation for the differential effect of the white noise mask in
Experiment 1 was based on the assumption that speech can be classified into
well-known categories, and therefore is more tolerant than nonspeech of the
ambiguity induced by the mask in the auditory stimulus. If this is so, other
forms of stimulus degradation might have effects on speech and nonspeech
perception similar to those found in Experiment 1. A direct test of this
hypothesis was attempted in Experiment 2.

In the first description of the duplex perception phenomenon, Rand (1974)
reported that a 30-dB attenuation of the transition relative to the base did
not impair correct labeling of the fused speech percepts. With 50-dB
attenuation of the transition, labeling performance was still above chance.
However, no results were reported regarding the comparable effect of degrading
auditory information on the nonspeech aspect of the duplex percept. If
correct speech categorization can indeed be based on less auditory information
than is required for correct categorization of nonspeech, we should expect
that attenuation of the transition will impair speech labeling less than
nonspeech categorization. We tested this prediction by gradually attenuating
the intensity of the isolated transition for subjects instructed to label the
speech or nonspeech aspects of the duplex percept in two separate sessions.

Method 



Stimuli. The stimuli employed in Experiment 2 were the base and the [ba]
and [ga] transitions of Experiment 1. This time, however, instead of
manipulating the temporal asynchrony of base and transition, we kept them in
synchrony as we decreased the relative amplitude of the transition in step
sizes of 6 dB from about 80 dB (the amplitude employed in Experiment 1) to 66
dB below that amplitude. The decreases were accomplished on the DDP-224 PCM
waveform editing system at Haskins Laboratories.

Test tapes. The three inspection series were as in Experiment 1. In the
test series, duplex stimuli were presented, with the base and transition
recorded on separate channels in onset synchrony. Each of the two transitions
occurred eight times at each of twelve different amplitude levels: equal to
the base, -6, -12, -18, -24, -30, -36, -42, -48, -54, -60, and -66 dB. This
yielded a total of 192 stimuli, which were recorded in a randomized sequence
with interstimulus and interblock intervals as in Experiment 1.
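The attenuation series and the stimulus total can be reproduced with a short
sketch. It assumes that the stated decibel values describe amplitude (sound
pressure) ratios, so that a level of L dB corresponds to a linear scale factor
of 10^(L/20); the text does not state how the waveform editing system applied
the attenuation, and the function names below are hypothetical.

```python
TRANSITIONS = ("ba", "ga")
REPETITIONS = 8
STEP_DB = 6      # step size given in the text
N_LEVELS = 12    # "equal to the base" down to -66 dB


def attenuation_levels_db(n_levels=N_LEVELS, step_db=STEP_DB):
    """Levels relative to the unattenuated transition: 0, -6, ..., -66 dB."""
    return [-step_db * i for i in range(n_levels)]


def amplitude_ratio(level_db):
    """Linear amplitude scale factor for a level in dB (assumes 20*log10)."""
    return 10 ** (level_db / 20)


levels = attenuation_levels_db()
n_stimuli = len(TRANSITIONS) * REPETITIONS * len(levels)
```

On this assumption each 6 dB step roughly halves the amplitude of the
transition, and the lowest level (-66 dB) scales it by about 0.0005.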

Procedure 

The procedure was analogous to that employed in Experiment 1, which had
preceded Experiment 2 by several weeks. There were two sessions, with order
counterbalanced across subjects. In one session, the task was to label the
speech percepts as [ba], [ga], or [da]. The response category of [da] was
included because [da] is the response most often assigned to the base in
isolation, but no [da] responses were given by our subjects, perhaps owing to
their previous experience in Experiment 1. Presentation of the first and
third inspection series (electronically fused [ba] and [ga] syllables and
duplex stimuli) was followed by presentation of the test series. In the other
session, the task was to categorize the percepts as rising or falling in
pitch, or to respond with "0" if no nonspeech percept was heard. Presentation
of the second (isolated transitions) and third practice series was followed by
presentation of the test series.

Subjects . The subjects were the same ten young women who participated in 
Experiment 1. 

Results 

The pattern of results, averaged across subjects, is graphically
summarized in Figure 4, where the solid line represents the accuracy of
responses when subjects were asked to label their speech percepts, the dashed
line represents the accuracy of response when subjects labeled their nonspeech
percepts of rising and falling chirps, and the dotted line represents the
percentage of the trials on which subjects did not hear any chirp at all. In
general, the accuracy of speech perception was superior to the accuracy of
nonspeech perception, F(1,9) = 10.85, MSe = 840, p < .009, and the systematic
decrease in transition amplitude had a penalizing effect on both,
F(11,100) = 23.23, MSe = 130, p < .0001. Yet, amplitude decreases had a
significantly greater effect on nonspeech perception than they had on speech
perception, F(11,99) = 3.11, MSe = 151, p < .001. For example, at an
amplitude decrease of 42 dB, subjects reported hearing no chirps at all in 50%
of the trials, and their categorization of those chirps that were heard was
only 75% correct. Yet, at this same amplitude decrease, speech labeling was
95% accurate. At -48 dB, subjects were reporting a nonspeech percept in only
20% of the trials and their nonspeech labeling was at chance, but their speech
labeling was correct in 89% of the trials. Indeed, not until an amplitude
decrease of 66 dB did speech perception approach a 50% level of accuracy.
This, then, we regard as the major outcome of this experiment: At certain
amplitude levels where nonspeech percepts of the F2 transitions often go
undetected and, even when detected, are categorized at chance levels, speech
perception conveyed the phonetic information in the same formant transitions
quite accurately.

Figure 4. The effects of a decrease in the amplitude of the isolated
transition on labeling syllables, "chirp" detection, and "chirp"
categorization.



Discussion

The results of Experiment 2 replicate those of Rand (1974) and provide
strong support for the hypothesis that categorization of speech may be based
on less auditory information than is required for nonspeech categorization.
The different sensitivity of the speech and nonspeech perception mechanisms to
the decrease in the intensity of the transitions also suggests that they
operate at different levels of information processing. As Cutting (1976)
pointed out, sensitivity to energy level is indicative of lower-level
processing in audition, as Turvey (1973) suggested for visual perception. The
smaller effect of the decrease in the transition's energy on speech than on
nonspeech perception indicates that speech labeling is less sensitive to
changes in energy levels of the stimulus and, therefore, is based on
higher-level perceptual processes than is nonspeech categorization.

Experiment 3

In Experiment 1 the mask was presented in the same channel as the
transition. The purpose of the third experiment was to examine the effects of
a white noise mask, presented simultaneously with the transition but in the
same channel as the base, on speech and nonspeech perception of the dichotic
stimulus.

Massaro (1970) reported that contralateral backward masking of tonal
targets by tonal masks interferes with pitch perception as much as binaural
backward masking. Dichotic backward masking effects have also been found with
more complex stimuli, such as CV syllables (Darwin, 1971; Studdert-Kennedy,
Shankweiler, & Schulman, 1970). In contrast, when the contralateral mask
leads the target stimulus, recognition of CV syllables is less impaired
(Darwin, 1971; Studdert-Kennedy et al., 1970), and pitch recognition of tones
is not affected at all (Massaro, 1970). On the basis of these results and of
binaural masking effects, it has been suggested that an auditory input
produces a preperceptual auditory image that represents the information in the
stimulus and is located centrally. The recognition process entails a readout
of the information from this preperceptual auditory image (Massaro, 1972b),
and this process is penalized when a lagging stimulus (the mask) occurs before
completion of the readout of the necessary information. This hypothesis
explains the difference between forward and backward masking effects.

We may assume that two similar intense stimuli simultaneously presented
to the two ears will be simultaneously present in the centrally located
preperceptual storage. In this case, since the white noise and the isolated
chirp are both nonspeech stimuli, they might fuse but should not provide a
duplex percept. Since the same "constant" (the white noise) is added to each
of the two different chirps, we may expect that the combined noise + chirp
percepts would be discriminable and therefore nonspeech categorization will
not be affected by the contralateral simultaneous white noise mask. A similar
prediction was made about masking effects on the speech aspect of the duplex
percept. If indeed speech perception involves readout from the preperceptual
image, the base, the transition, and the white noise would all interact. The
results of Experiments 1 and 2 suggest, however, that speech information can
be extracted quite accurately from a "noisy" image, and therefore the mask
should not interfere with syllable labeling.

Method 

Stimuli. The stimuli were 30 ms F2 transitions separated from two-formant
approximations of [ba] and [ga], and the base, all newly synthesized on
the Haskins software parallel resonance synthesizer. The base was 100 ms in
duration, had a 25 ms amplitude ramp at onset, a 100 ms amplitude ramp at
offset, and a fundamental frequency that fell linearly from 1.1 Hz to 81 Hz.
The first formant increased linearly from 250 Hz to a steady state of 765 Hz
during the first 30 ms. The steady state of the base's F2 was at 1230 Hz and
be- [...] of the first formant. The [ga] transition fell linearly from 2200
to 1230 Hz, and the [ba] transition rose linearly from 1000 to 1230 Hz. The
fundamental frequency of the transitions was identical to that in the first
30 ms of the base.
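The linear F2 contours described above can be written out explicitly. The
sketch below interpolates the formant frequency between the stated onset and
steady-state values and, purely as an illustration, renders each contour as a
sine-wave sweep. The sine sweep is a stand-in and not the parallel resonance
synthesis used for the actual stimuli; the 10 kHz sample rate is borrowed from
Experiment 1 and is an assumption for Experiment 3, and all function names are
hypothetical.

```python
import math

F2_STEADY_HZ = 1230.0                      # steady state of the base's F2
DURATION_MS = 30.0                         # transition duration in Experiment 3
ONSETS_HZ = {"ba": 1000.0, "ga": 2200.0}   # onset frequencies from the text


def f2_frequency(label, t_ms):
    """F2 frequency (Hz) at time t_ms, linearly interpolated over the transition."""
    if not 0.0 <= t_ms <= DURATION_MS:
        raise ValueError("t_ms must lie within the transition")
    onset = ONSETS_HZ[label]
    return onset + (F2_STEADY_HZ - onset) * (t_ms / DURATION_MS)


def sine_chirp(label, sample_rate_hz=10_000):
    """Sine sweep following the F2 contour (illustrative stand-in only)."""
    n_samples = int(round(DURATION_MS / 1000.0 * sample_rate_hz))
    phase = 0.0
    samples = []
    for i in range(n_samples):
        t_ms = 1000.0 * i / sample_rate_hz
        samples.append(math.sin(phase))
        # Advance the phase by the instantaneous frequency for one sample.
        phase += 2.0 * math.pi * f2_frequency(label, t_ms) / sample_rate_hz
    return samples
```

The [ba] contour rises by 230 Hz and the [ga] contour falls by 970 Hz over the
same 30 ms, which is the rising versus falling difference that subjects
categorized in the nonspeech task.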

For duplex presentation in the control condition, a [ba] or a
[ga] transition was recorded onto one channel of a magnetic tape and the base
was recorded onto the other channel with 30 ms SOA, following the transition.
In the masking condition, a 30 ms segment of white noise immediately preceded
the base (i.e., the noise was an ipsilateral forward mask with respect to the
base, but a simultaneous contralateral mask with respect to the chirp).

Test tapes. A screening tape and a test tape were prepared. The
screening tape contained five series of stimuli, starting with the full [ba]
and [ga] syllables repeated ten times, followed by ten alternations between
the two syllables. In the second series the F2 [ba] and [ga] transitions
occurred in a sequence identical to that for the intact syllables. Following
the isolated transitions, the duplex percepts were introduced in the following
order: First, the base was paired with the [ba] transition and recorded
dichotically five times. Next, the base was paired with the [ga] transition
and recorded dichotically five times. These two blocks were repeated once,
followed by ten alternations between the [ba] and the [ga] duplex stimuli.
The final two series on the screening tape were two different randomizations
of 30 [ga] and 30 [ba] duplex stimuli. The interstimulus interval (ISI)
throughout the tape was 2.5 seconds.

The test tape comprised four different randomizations of 20 [ba] and 20
[ga] duplex stimuli. Two series constituted the control condition, and two
the masking condition. The ISI was 2.5 seconds throughout the tape.

Subjects. The subjects were seven out of 12 female subjects screened for
participation in Experiment 4. Their only experience with listening to
synthetic speech was in Experiment 4 (which was run first), and they were
chosen only for reasons of availability. We shall describe here the pretesting
procedure, by which the subjects were screened for both experiments.

Twenty-six volunteers were pretested in groups of two to four, in two
sessions separated by at least 48 hours. In the "speech" session, subjects
were first presented binaurally with the series of [ba] and [ga] syllables,
and asked to label them with no restriction whatsoever. All subjects reported
that they perceived speech, and used the labels "ba," "da," "ga," and "ya" to
describe their percepts. The blocked duplex stimuli were then presented,
followed by the first list of 60 randomized stimuli. The subjects were
instructed to label each of the 60 stimuli using their own notation. In the
"nonspeech" session, the "chirps" were presented first and the subjects were
required to describe the two different percepts. None of the subjects
perceived the chirps as speech. A notation was then proposed for the rising
chirp ([ba] transition), and another for the falling chirp ([ga] transition).
The blocked duplex stimuli were then presented again, and the subjects
were instructed to label the chirps this time, and to ignore the speech
channel. Finally, the second series of 60 randomized duplex stimuli was given,
and the subjects again labeled the chirps. Only subjects who were correct on
at least ^0 out of the 60 trials in both the speech and the nonspeech pretest
sessions were included in the experiment proper.

Procedure. Since most stimuli were labeled by all subjects as [ba] or
[ga], the speech responses during testing were restricted to those two
categories. Subjects were tested in two sessions, separated by at least 48
hours. In one session they labeled the speech percepts, and in the other they
categorized the nonspeech percepts as rising or falling chirps. The order of
the sessions was counterbalanced among subjects. Each session began with a
review, using the screening sequences of syllables and duplex stimuli for the
speech sessions, and isolated transitions and duplex stimuli for the nonspeech
sessions. Presentation of one randomization for the control and one for the
masking condition followed, with the order counterbalanced.

Results

As can be seen in Figure 5, in the control condition speech was correctly
labeled 75% of the time, and nonspeech categorization was correct on 76% of
the trials. In the masked condition, however, speech perception was
significantly impaired, being labeled correctly on only 62% of the trials,
whereas correct categorization of nonspeech remained nearly constant at 7^%.
This condition by perceptual mode interaction was supported by an ANOVA with
repeated measures, F(1,6) = 12.4, MSe = 16.2, p < .01.

Figure 5. The effects of forward masking the "base" with white noise on
labeling syllables and "chirp" categorization.

Discussion

The speech labeling performance in the control condition replicated the
results of Experiment 1 at 30 ms SOA, while categorization of unmasked
nonspeech was somewhat poorer than expected. This, however, had the advantage
that both modes of perception in the control condition were comparable in
level of accuracy. When a white noise mask was presented in the speech channel
preceding the base, nonspeech perception was not affected while speech
labeling was significantly reduced. Thus, simultaneous dichotic masking of the
transitions was not effective for nonspeech percepts. Comparing these results
with the damaging effect of the monotic backward masking obtained in
Experiment 1, and with the effective dichotic backward masking obtained by others,
we assume that in the dichotic simultaneous presentation, the transition and
the white noise were integrated into one preperceptual image as predicted by
Massaro's model, and that this image contained sufficient information to
support identification of rising vs. falling chirps.

The masking effect of white noise on speech was not predicted. In line
with the results of Experiments 1 and 2, which suggested that speech
perception is relatively tolerant of stimulus degradation, we expected no white
noise masking of speech, regardless of the channel. Given the reduced forward
masking effects reported previously, and the redundant nature of the
information in the base, it seems unlikely that the decrease of speech perception
performance in this experiment was caused by proactive effects of the white
noise mask on the base. However, forward masking effects of the white noise
on the F1 transition in the base that might be critical to perception of any
stop consonant cannot be excluded. It is also possible that the fused image
of the white noise and the transition that was available in the preperceptual
storage when the base arrived might have been qualitatively different from
the image of the original transitions and, therefore, less likely to fuse and
to provide phonetic information to the base. In any event, addition of the
white noise altered the recovery of phonetic information in (F1 or F2)
transitions but did not penalize recovery of nonspeech information. We cannot,
therefore, sustain the conclusion of Experiment 2 that the speech perception
system, in general, does not require as precise auditory information as does
the nonspeech system. Rather, we should assume that it has different
requirements and different sensitivities, and therefore comprises a different mode of
information processing. This assumption was further investigated in
Experiment 4.

Experiment 4

The fourth experiment was designed to compare the effects of two different
backward masks on speech and nonspeech perception: a white noise mask,
similar to that used in Experiment 3, and an F2 transition derived from a
synthetic approximation to the syllable [da]. The comparison provided a test
of the two explanations we have offered to account for the selective effect
that the monotic white noise backward mask had on the nonspeech aspect of the
duplex percept (Experiments 1 and 2). The first assumed that speech
perception may require less information than nonspeech. The second assumed
different perceptual processes for speech and for nonspeech, which are sensitive to
different aspects of the auditory stimulus.

If some aspects of the auditory stimulus are used by the phonetic
perception mechanism to support speech identification, while different aspects of
the same stimulus are used for nonspeech categorization, backward masks that
contain different amounts of phonetic information might affect the two
perceptual modes differently. A white noise mask that provides only limited
phonetic information should have little effect on speech perception but, as in
Experiment 1, should effectively mask the nonspeech percepts. On the other
hand, a [da] transition that is a potential cue for a phonetic percept may
more effectively mask speech because it provides phonetically patterned
information.

A second means by which Experiment 4 provided a test of the above-mentioned
hypotheses was by examining the influence of the difficulty of the task
on backward masking of speech and nonspeech by the different masks. Thus we
compared the relative effects of each mask in a labeling task, similar to the
task used in Experiments 1 and 2, and in a discrimination task using the AXB
paradigm. In the AXB paradigm, no predefined perceptual categories are
necessary. Although the labeling of auditory percepts may facilitate their storage
in short-term memory, labels are not essential to accurate discrimination
performance (i.e., deciding whether X is A or B). Therefore, we assumed that in
the discrimination task, speech and nonspeech perception would be similarly
tolerant of ambiguous information, and expected that if selective backward
masking effects on speech and nonspeech labeling are due to the overlearning
of speech categories, they would be less salient in the AXB paradigm.
Obviously, since speech perception may be both sensitive to different aspects of
the auditory stimulus and require less information due to overlearning, a
three-way interaction between the required perception mode, the nature of the
mask, and the difficulty of the task might exist.

Method 

Stimuli. The stimuli for this experiment were two-formant approximations
of the syllables [ba] and [ga], newly synthesized on the Haskins software
parallel resonance synthesizer. The duration of each of the two transitions
was 30 ms, and the duration of the base was 175 ms. The fundamental frequency
of the base decreased linearly from 114 to 81 Hz. The first formant rose from
250 Hz to a steady state of 765 Hz. The steady state of F2 was at 1230 Hz.
The [ga] transition went from 2200 Hz to 1230 Hz, and the [ba] transition went
from 1000 Hz to 1230 Hz. The fundamental frequency contour of the transitions
matched that of the first 30 ms of the base.
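
As a rough illustration of the stimulus parameters just listed, the sketch below tabulates the two F2 transitions as straight-line glides toward the 1230 Hz steady state. The linear shape of the glides, the 5-ms sampling step, and the helper name `linear_track` are assumptions made for illustration only; the text specifies just the endpoint frequencies and the 30-ms duration.

```python
def linear_track(start_hz, end_hz, duration_ms, step_ms=5):
    """Return (time_ms, freq_hz) points along a straight-line frequency glide.

    Linear interpolation is an assumption; the synthesizer's actual
    trajectory shape is not stated in the text.
    """
    n_steps = duration_ms // step_ms
    return [
        (i * step_ms, start_hz + (end_hz - start_hz) * i / n_steps)
        for i in range(n_steps + 1)
    ]


F2_STEADY_HZ = 1230      # steady state of F2 in the base
TRANSITION_MS = 30       # duration of each isolated transition

# [ba]: rising chirp; [ga]: falling chirp (endpoints taken from the text).
ba_transition = linear_track(1000, F2_STEADY_HZ, TRANSITION_MS)
ga_transition = linear_track(2200, F2_STEADY_HZ, TRANSITION_MS)

for t_ms, f_hz in ba_transition:
    print(f"[ba] t = {t_ms:2d} ms  F2 = {f_hz:7.1f} Hz")
```

Both tracks end at the same frequency, which is why the two transitions differ only in direction and extent of the glide: 230 Hz upward for [ba] and 970 Hz downward for [ga].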

There were two masking conditions, involving a white noise mask and a
[da] chirp mask, and a control condition in which no mask was presented. The
SOA between the transition and the base was 30 ms for all trials in all
conditions; thus the transition offset coincided with the base onset, and the
masks were simultaneous with the first formant transition of the base. In the
white noise mask condition, a 30 ms segment of white noise immediately
followed the [ba] or the [ga] transitions. In the [da] chirp mask condition, the
transitions were immediately followed by a ^0-ms F2 transition separated from
a two-formant synthetic [da], which had the same base as our test stimuli.
The [da] transition fell from 1600 to 1230 Hz, and the fundamental frequency
and amplitude contour were identical with those of the [ba] and [ga]
transitions.

Tapes. A screening tape and two test tapes were prepared. The screening
tape contained five series of stimuli as described in Experiment 3.

The labeling test tape comprised two different series of 120 randomized
duplex stimuli (60 [ba] and 60 [ga]). In each series there were 40 control
trials and 40 trials for each of the two masking conditions. In both the
control and masking conditions, the ba-transition was presented in 20 trials and
the ga-transition in the other 20 trials. The ISI was ?.5 seconds throughout
the tape.

The discrimination test tape included two different randomizations of 60
test trials, each preceded by eight practice trials. Every trial consisted of
three stimuli in AXB design, where the second stimulus, X, was identical
either to the first stimulus, A, or to the third, B. The A and B stimuli were,
respectively, the [ba] and [ga] duplex stimuli used in the labeling test.
Five trials for the control condition and for each masking condition were used
in each cell of a counterbalanced design, AAB, ABB, BBA, and BAA. Within each
trial all three stimuli were taken from the same condition. The interstimulus
interval was 0.5 sec and the intertrial interval was 3 sec.
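
The layout of the discrimination tape can be checked with a short sketch: three conditions crossed with four AXB orders, five trials per cell, gives the 60 test trials per randomization. The condition labels, the dictionary layout, the random seed, and the function names below are hypothetical, chosen only for illustration; they are not taken from the original tapes.

```python
import itertools
import random
from collections import Counter

# Labels are illustrative assumptions; the design itself follows the text.
CONDITIONS = ("control", "white_noise_mask", "da_chirp_mask")
AXB_ORDERS = ("AAB", "ABB", "BBA", "BAA")   # A = [ba] duplex stimulus, B = [ga]
SYLLABLE = {"A": "ba", "B": "ga"}


def build_axb_trials(reps_per_cell=5, seed=0):
    """Build one randomization of the discrimination test (3 x 4 x 5 trials)."""
    trials = [
        {
            "condition": cond,
            "order": order,
            "stimuli": tuple(SYLLABLE[ch] for ch in order),
        }
        for cond, order in itertools.product(CONDITIONS, AXB_ORDERS)
        for _ in range(reps_per_cell)
    ]
    random.Random(seed).shuffle(trials)  # the seed value is arbitrary
    return trials


def odd_one_out(order):
    """X is the middle stimulus; the odd item is the end that differs from X."""
    return "first" if order[0] != order[1] else "last"


axb_trials = build_axb_trials()
per_condition = Counter(trial["condition"] for trial in axb_trials)
print(len(axb_trials), dict(per_condition))
```

Because each order string has the odd stimulus at one end, the correct response ("first" or "last") is balanced across the four orders within every condition.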

Procedure

The screening procedures used for selecting subjects in this experiment
were described in detail in the Method section of Experiment 3.

Following the screening, speech and nonspeech duplex perception were
tested in separate sessions at least two days apart. Speech perception was
tested first in half of the subjects, while the other half started with
nonspeech. Each session started with a "refresher" series of 30-40 stimuli, in
which the subjects practiced labeling the syllables or the chirps. Then they
were warned about the presence of the masks, and instructed to label the
stimuli using only the [ba] and [ga] labels. The subjects were encouraged to
guess each time they were not sure about their percept. The labeling test was
always given first, followed by the discrimination test. In the
discrimination test, the subjects were instructed to determine whether the first or the
last stimulus in each set of three was the "odd one out." They were then given
eight practice trials (24 stimuli), followed by the test sequence itself.

Subjects. The subjects were 12 paid female students screened out of 26
volunteers. They had little or no previous experience listening to synthetic
speech.

Results 

It is noteworthy that during the screening tests all subjects heard the
synthetic speech sounds as speech, without any prompting by the experimenter.
Also, all but two subjects heard the isolated transitions as nonspeech.
(These two subjects were not run in the experiment.) All 12 subjects tested
in the experiment proper described the discrimination task as being
considerably easier than the labeling task.

Figure 6 presents the percentages of correct responses for speech and
nonspeech in each task for the control and masking conditions. A three-factor
analysis of variance revealed that performance in the speech mode overall was
slightly better than in the nonspeech mode, F(1,11) = 4.82, MSe = 283, p <
.051, and that performance was better in discrimination than in labeling,
F(1,11) = 8.00, MSe = 371, p < .017. More correct responses were given in the
control than in the masking conditions, as revealed by a main effect of
condition, F(2,10) = 27.15, MSe = 138, p < .001. Although this main effect was
largely due to the white noise mask condition, a post-hoc comparison of the
control and the [da] chirp mask conditions turned out to be significant as
well, F(1,11) = 6.89, MSe = 73, p < .024. Most importantly, speech and
nonspeech perception were differently affected by masking, as reflected in a
significant interaction between the two factors, F(2,10) = 6.88, MSe = 48, p <
.014. Post-hoc analyses of this interaction revealed that for both
categorization and discrimination, the white noise mask penalized nonspeech perception
more than it penalized speech, while compared with the control condition in
each mode, overall, the [da] chirp mask impaired the perception of speech more
effectively. The three-way interaction was not significant.

Discussion 

Better performance was found in the discrimination than in the labeling
task for both speech and nonspeech perception, validating our manipulation of
task difficulty. However, the effect of the white noise mask on both speech
and nonspeech perception was identical for both discrimination and labeling of
the stimuli. In contrast to Experiment 1, speech perception seemed to be
somewhat impaired by the white noise mask, but it was significantly less
affected than nonspeech perception. Since the selective effect of the
backward white noise mask on nonspeech perception was obtained in both tasks, the
results do not support the interpretation that the specific effect of the
white noise mask on nonspeech categorization is due to the greater difficulty
of nonspeech processing. Rather, we should look for differences in processing
phonetic and nonphonetic information that might account for the results.

Figure 6. The effects of backward masking the transition with white noise and
with a different transition on labeling and discrimination of
speech and nonspeech.

In contrast to the white noise mask, the effects of the [da] chirp mask
are less clear. Since the [da] transition was synchronous with the base, one
might expect that this transition would fuse with the base, generating a [da]
percept on most trials and reducing any distinction between [ba] and [ga]
percepts to chance level. Figure 6 indicates, however, that while the [da]
transition reduced correct labeling of speech by 7.7%, correct discrimination
was reduced by only 4.3%. It seems that while the white noise mask has a
similar effect on discrimination and categorization of both speech and nonspeech,
the [da] chirp mask is more efficient in masking speech labeling than speech
discrimination. This trend suggests that for speech, the two tasks might have
drawn on different strategies and auditory cues.

General Discussion

The present study investigated some differences between speech and
nonspeech perceptual modes. Strong additional support was provided for the
hypothesis that phonetic and nonphonetic information are processed differently
in the auditory system, and new information about the nature of this difference
was reported.

In Experiment 1 we demonstrated an interesting dissociation between
speech and nonspeech perception of a transition presented in a duplex
perception context: A burst of white noise following the transition effectively
masked the nonspeech aspects of the percept, while leaving intact the phonetic
aspects necessary to support perception of a stop consonant. Two possible
explanations of this phenomenon were investigated in Experiments 2, 3, and 4.
The first, which assumes that speech perception is more tolerant of ambiguous
auditory information (since it uses well-learned categories), was apparently
supported by the results of Experiment 2. In this experiment we showed that
speech perception performance is above chance even when the chirps are no
longer detected. However, the results of Experiment 3 revealed that when the
white noise mask was presented contralaterally but simultaneously with the
transition, speech perception was impaired. We have argued that the white
noise could have combined with the second formant to form a new stimulus that
was less effective in supporting phonetic perception, or could have masked
some critical part of the base, such as the F1 transition. The first
interpretation implies that the speech perception system not only is more
tolerant of ambiguous auditory input (although it might certainly be so), but
also that it is not as sensitive to energy manipulations as it is to certain
informational aspects of the stimulus. This hypothesis was tested in
Experiment 4, in which the masking effect of the white noise on speech and nonspeech
perception was compared with the masking effect of a different transition.
Although not entirely conclusive, the results of Experiment 4 supported the
following hypothesis: A [da] transition masks labeling of [ba] and [ga]
syllables slightly more than white noise, while nonspeech perception is penalized
significantly more by white noise than by a transition.

Our finding of successful fusion and accurate CV perception despite SOA
between the transition and the base replicates the results of previous
studies, and implies the existence of a storage mechanism where the transition
is still available when the base is apprehended. It is with reference to this
stage that we will try to explain how the different masks influenced
perception of second formant transitions in the speech and nonspeech modes. First,
let us consider the effect of backward masking. The white noise backward mask
interfered with the readout of auditory, nonphonetic information from the
perceptual storage, thereby penalizing identification of the rising and
falling chirps. In this case, the effects of the white noise mask were relatively
greater than those of the [da] mask because the former was relatively greater
in intensity, and we should expect intensity to be of primary relevance in
preperceptual, lower-level processing in audition (Cutting, 1976), as well as
in vision (Turvey, 1973). The white noise mask, however, did not interfere
with phonetic processing of the transition nearly as extensively as it
interfered with nonphonetic processing. Perhaps phonetic perception is more
resistant to backward masking because the phonetic processor, once triggered by
some as yet undefined aspect of an acoustic syllable, is uniquely suited to
recovering information in formant transitions that is pertinent to place of
articulation. In any case, the particular type of processing involved in
perception in the speech mode would seem to be more vulnerable to the effects
of the [da] mask than those of the white noise mask. This could occur because
the [da] mask also provides information about the place of articulation, which
competes with that provided by the [ba] or [ga] transition. In any event, the
masking of phonetic information in the transition is not particularly
vulnerable to differences in intensity between the [da] mask and the white noise
mask, perhaps because those differences are not relevant to the perception of
place of articulation.

Turning now to those results obtained with simultaneous masking, we have
found that when a white noise mask is presented that is simultaneous and
contralateral to the transition, it has the opposite effect of penalizing
phonetic perception, but not nonphonetic perception. Here the greater resistance of
nonphonetic perception could be owing to the fact that the noise did not
replace the transition in preperceptual acoustic storage, but merely became
integrated with it, and hence did not cause a premature cessation of the
relevant auditory readout processes. Considering now the penalizing effect of the
noise on the speech percept, two explanations have occurred to us. One is
that when the transition and the white noise co-occur, they fuse, and the
resulting stimulus is less likely to provide phonetic information, which would
be fused with that provided by the base. Another is that the white noise
obliterates some critical aspect of the base, such as the first-formant
transition, which may be a critical cue to stop consonant manner and thus
essential to the assigned task of identifying [ba] and [ga].

The results of this study further provide some insight into how, and at
what level of information processing, speech is recognized as such, and starts
to be processed differentially. Given that once duplex perception is
achieved, the base is not heard in and of itself, we are led to entertain the
possibility that the storage mechanism is preperceptual, although certain data
suggest that it cannot be based on transmission channels (Massaro, 1970). One
solution is to adopt Massaro's notion of preperceptual central images and
extend it to include auditory cues necessary for phonetic perception.
"Phonetic" auditory information stored in the preperceptual storage buffer would not
yet be identified as speech. Rather, its ultimate phonetic perception would
be contingent upon the activation of a central mechanism. Confining ourselves
to the present concern of the distinctions between phonetic and nonphonetic
perception of second formant transitions, we would conclude by suggesting, in
line with previous processing theories (cf. Turvey, 1973), that speech
perception involves activation of a central mechanism, while nonspeech perception is
more dependent on peripheral auditory processes. We suggest, therefore, that
perception of speech, as a distinct process, starts when the auditory
information reaches the central nervous system and "turns on" a special perceptual
mechanism.

VOWELS IN CONSONANTAL CONTEXT ARE PERCEIVED MORE LINGUISTICALLY THAN ARE
ISOLATED VOWELS: EVIDENCE FROM AN INDIVIDUAL DIFFERENCES SCALING STUDY*



Brad Rakerd



Abstract. The purpose of this investigation was to determine
whether the presence of neighboring consonants can exert a contextual
influence on vowel perception and, if so, to characterize the
influence. Two experiments were carried out toward that end. In both,
subjects were asked to judge the linguistic similarity relationships
that held among a set of American English vowels when those vowels
occurred either: (a) in isolation; or (b) in /dVd/ consonantal
context. The judgments were made in response to recordings of
natural speech in Experiment 1. In Experiment 2, they were made for
subjects' memorial images of vowels as elicited by written stimuli.
Individual differences scaling of the outcomes of the two
experiments provided evidence that supported the following conclusions:
(1) consonantal context can significantly influence vowel
perception; (2) for the /dVd/ context at least, the nature of the
influence is to evoke more linguistic perceptual processing of vowels
than occurs when they are presented in isolation; (3) the influence
is more likely to be explained in terms of properties of the stimuli
presented to perceivers than in terms of any sort of knowledge that
perceivers bring to bear in perceptual processing; and (4) three
features of linguistic description for vowels (advancement, height,
and tenseness) have particular import for vowel perception and for
vowel memory.

It has long been recognized that the acoustic correlates of a vowel can
vary, sometimes to a substantial degree, depending on the identity of the
consonants that precede and/or follow it (e.g., House & Fairbanks, 1953;
Lindblom, 1963; Stevens & House, 1963). This variation has come to be understood
in terms of the fact that a talker often coarticulates the neighboring
segments of an utterance (that is, overlaps their respective productions) such
that the acoustic signal is jointly influenced by those segments (e.g.,
Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). How, then, do vowel
perceivers adjust to these acoustic variations? One possibility is that in
many or most cases they do not. If the variations are sufficiently minor, a
perceiver could simply "ignore" them and achieve an acceptably high level of
performance in identifying vowels. Alternatively, the perceiving of vowels
might involve certain context-sensitive perceptual strategies, analogous to
those that are generally thought to be required when listeners identify the
consonants of an utterance (for reviews of the evidence regarding consonantal
perception, see, e.g., Liberman, 1982; Liberman et al., 1967; Liberman &
Pisoni, 1977).

It becomes important to determine whether or not vowel perceivers are
sensitive to consonantal context because (unlike most consonants) vowels can
be freed from the influences of their neighboring segments and produced as
isolated utterances. A good deal is known about the perception (and
production) of these isolated vowels and there is an important issue as to how to
generalize from that knowledge to other cases. To the degree that listeners
are, in fact, indifferent to the acoustic variations engendered by consonantal
context, isolated productions might be taken to be the canonical vowel form,
and their acoustic signature to be the one that best exhibits the essential
information for vowel perception. However, if the perceiving of vowels does,
in general, involve context-sensitive strategies, then the isolated vowel form
is but one of many variants, and, arguably, one of the least representative
variants because it occurs infrequently in natural speech. An answer to the
question of whether isolated vowels are perceived differently than vowels in
consonantal context therefore proves to be basic to vowel research.

Previous efforts to answer this question have generally been based on
comparisons of the identifiability of vowels in and out of some consonantal
frame. Evidence gathered with this method has, in a number of instances, been
taken to favor the view that consonantal context can significantly affect vowel
perception by exerting a positive influence on vowel identification (Gottfried
& Strange, 1980; Strange, Edman, & Jenkins, 1979; Strange, Verbrugge,
Shankweiler, & Edman, 1976). This finding remains a subject of debate, however.
It has not been observed in all studies (Macchi, 1980; Pisoni, 1979) and it
has been challenged on grounds of being largely an artifact of the method of
assessment (Assmann, Nearey, & Hogan, 1982; Diehl, McCusker, & Chapman, 1981;
but see Rakerd, Verbrugge, & Shankweiler, in press; Strange & Gottfried,
1980).

The present study complements this work by addressing the question of a
consonantal influence on vowel perception with evidence of a different kind
than has been offered in the past. To begin with, the data collected
were judgments of vowel similarity rather than absolute identification
judgments; hence, they assess the consonantal influence with a new perceptual
measure. More importantly, the resulting data were analyzed with an
individual differences scaling technique that highlights aspects of the data structure
that have not been considered previously. Those aspects are: (1) the
dimensions of perception that had some shared significance for the set of subjects
as a whole; and (2) the relative salience that those dimensions had for the
individual subjects depending on whether they judged vowels in or out of
context. It will be seen that both aspects of the scaling solution were
informative about the nature of vowel perception.

Experiment 1

The starting point for an individual differences scaling analysis of
vowel perception is to collect, for each subject, a matrix of what Shepard
(1962a, 1962b) has called "proximities" data. These are data indexing the
network of perceptual relationships that hold among a set of vowels. A
triadic comparisons procedure was employed to collect those data in the present
experiment. That procedure was chosen because it had proven useful in
previous vowel research. More than a decade ago, Pols and his associates (Pols,
Tromp, & Plomp, 1973; Pols, van der Kamp, & Plomp, 1969) assessed the
perceived vowel quality of spectrally constant speechlike sounds by requiring
that subjects compare triplets of stimuli on a trial. Specifically, subjects
were required to judge which two members of a triplet sounded most alike to
them and which two least alike. They then proceeded to a new triplet and,
over trials, judged all possible stimulus combinations. This procedure
yielded reliable data that were interpretable, both from a linguistic standpoint
and with respect to acoustic properties of the stimuli. Others (Singh &
Woods, 1970; Terbeek & Harshman, 1971) have since employed the triadic
comparisons method in vowel perception studies and obtained equally
satisfactory results. It was used here to compare the perception of isolated vowels
with that of vowels in a consonantal context (/dVd/).



Method 

Subjects. Twenty-three subjects, randomly selected from a pool of
individuals registered with the Haskins Laboratories in New Haven, Connecticut,
were paid to participate in Experiment 1. All of them were native speakers of
English, and none had any history of hearing difficulties. It was ensured that
they had no prior knowledge of the purpose of this study or the design of the
experiment. Twelve of the subjects were assigned to the isolated-vowels
condition of the experiment, eleven to the consonantal-context condition.

Stimuli. The stimuli were natural productions of ten American English
vowels: /i, i, £, a* a, a, o, o, u, u/. A single male talker, who spoke a
General American dialect, recorded these vowels in each of two contexts: (1)
in the trisyllable frame /hadVda/, where the second syllable (/dVd/) was
stressed; and (2) in isolation. The /dVd/ consonantal frame was chosen
because it imposed certain coarticulatory constraints on the talker. In order
to produce initial and final /d/ consonants, the jaw must be closed and the
tongue tip sealed against the back of the teeth. Articulation of the syllable
vowel, which likewise requires an appropriate parameterization of the tongue
and jaw, must therefore be coordinated with that of the consonants.
Presumably owing to these coarticulatory constraints, there often is a substantial
degree of acoustic modulation associated with /dVd/ syllables. The stressed
target syllables were flanked by destressed syllables (/ha/ and /da/) to ensure
that the consonantal-context stimuli would not be meaningful words in English.

While seated in a sound-attenuated room, the talker produced several
tokens of each vowel in each context. These productions were tape recorded,
low-pass filtered at 5 kHz, digitized at a sampling rate of 10 kHz, and stored
in separate computer files. Two of the tokens of each vowel were used in the
experiment. In all cases, these were the first two tokens produced unless
some sort of articulatory anomaly such as vocal fry or "breathiness" was
audible. When an anomaly was heard in one of the first two tokens, it was
replaced by the third, and if that was anomalous by the fourth, and so on.
Acoustic analyses revealed that the stimuli selected by this procedure were
acoustically "normal," in that their spectral and temporal characteristics
were such as might be expected on the basis of data reported by previous
investigators (e.g., Lehiste & Peterson, 1961; Peterson & Barney, 1952).

Procedure 

Instructions. At the outset, it was explained to subjects that the task
would be to compare their perceptions of several different vowel sounds. It
was also explained that they were to base the comparison on linguistic aspects
of those sounds. The individual subjects were left to define their own
criteria for the linguistic aspects. They were, however, given the following
example as an aid:

If a child and an adult were both to say the vowel /i/ or the word /did/,
you would surely hear some differences between the vowel sounds. The
child's vowels would doubtless be softer, higher in pitch, and so on. On
the other hand, the /i/ vowels produced by the child and the adult would
also have something in common, a quality or qualities that distinguished
them from other vowels like the /ɛ/ in /dɛd/ or the /ɪ/ in /dɪd/. These
are the qualities that you should attend to in this experiment.

Triadic comparisons. As indicated earlier, the specific task set for
subjects was that of comparing triads of stimuli. On each experimental trial,
three vowels were randomly selected for presentation from the set of ten
alternatives, with the constraint that the particular triad chosen had not
occurred on any previous trial. A subject was allowed to listen to these three
vowels in any order and any number of times with the goal of reporting which
two of the three sounded most alike and which two least alike. Over the
course of the experiment, listeners judged all possible triadic combinations
of the ten vowel alternatives (120 possibilities). Note that this meant that
every vowel pair was, over trials, judged in relation to every other vowel in
the set.
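
The count of 120 triads, and the claim that every vowel pair is judged against
every other vowel, can be checked with a short enumeration. This is only an
illustrative sketch: the labels V1 to V10 are placeholder names for the ten
vowel alternatives, and the function names are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Placeholder labels for the ten vowel alternatives (hypothetical names).
VOWELS = [f"V{i}" for i in range(1, 11)]


def all_triads(items):
    """Return every unordered combination of three distinct items."""
    return list(combinations(items, 3))


def pair_context_counts(triads):
    """Count, for each pair, how many triads contain that pair."""
    counts = Counter()
    for triad in triads:
        for pair in combinations(triad, 2):
            counts[pair] += 1
    return counts


triads = all_triads(VOWELS)
pair_counts = pair_context_counts(triads)

print(len(triads))                 # number of distinct triads
print(len(pair_counts))            # number of distinct vowel pairs
print(set(pair_counts.values()))   # triads per pair (one per remaining vowel)
```

With ten vowels there are 45 distinct pairs, and each pair appears in eight
triads, once with each of the eight remaining vowels.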

Data were accumulated over trials according to a scoring procedure in which
vowel pairs judged most-alike and those judged least-alike were assigned
different scores. In this way, a matrix of data indexing the perceived
relationships among the ten vowel alternatives was obtained for each
subject. The matrix for a subject who rated vowels in consonantal context is
shown in Table 1 for purposes of example.

One of the virtues of the triadic-comparisons procedure was that it
placed minimal memorial demands on subjects, since only three stimuli had to
be dealt with on each trial and these could be played and replayed in any
order as needed. A second virtue was that the self-paced nature of the
procedure minimized the time pressure felt by subjects.

Familiarization with the equipment and procedures. A complete testing
session took about two hours. Roughly thirty minutes of that time was devoted
to familiarizing subjects with the equipment and procedures used to present
the stimuli and record the responses.

The subjects were tested individually. Each was fitted with headphones
and seated in front of a computer terminal that was housed in a
sound-attenuated room. Three of the keys on the terminal triggered
presentation of the appropriate stimuli for each triadic trial. After a key
was pressed, the corresponding stimulus was presented through headphones at a
comfortable listening level. The most-alike and least-alike response choices
were entered via a different set of keys on the terminal. Once these choices
had been made, the system advanced to the next trial.

The equipment and testing procedure were demonstrated to subjects over a
series of training trials. There were between 15 and 25 such trials depending
on the individual. Each training trial comprised a different triadic
combination of stimuli sampled from the set of 120 possibilities. For the
first few such trials, the experimenter operated the equipment, directing the
presentation of stimuli and entering response choices. After that, control was
passed to the subjects and they paced themselves. They were invited to ask
questions about all aspects of the procedure. The testing session was begun
only after subjects had both demonstrated competence in operating the
equipment and expressed confidence about understanding the perceptual task.

Analysis of the Data

To allow for a direct comparison between conditions, a single individual
differences scaling analysis was carried out on all subject data from the two
experimental conditions combined. The fundamental modeling assumption of
individual differences scaling is that when judging the same set of stimulus
items, all subjects will make reference to the same perceptual dimensions.
Subjects may differ from one another in terms of the relative weight
(salience) that they attach to those dimensions, but they cannot differ in
terms of the identity of the dimensions themselves (Carroll & Chang, 1970;
Wish & Carroll, 1974).

Consistent with this assumption, the scaling solution has two components
that, together, optimally account for the data structure of the individual
subject matrices. The first component, called a group space, is a model of
what the subjects have in common. The axes of the group space are the shared
perceptual dimensions, and these index a set of appropriately positioned
points representing the stimulus items. The second component of the scaling
solution is a weight space, which specifies the relative salience that the
several dimensions of the group space have for each subject. More formally, a
subject's weight for a particular dimension reflects the amount of variance in
her/his data that can be accounted for in terms of that dimension. Together,
the set of weights index a subject's location in the weight space.
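
How a shared group space and subject-specific weights combine can be sketched
as follows. The sketch assumes the weighted Euclidean distance form commonly
associated with individual differences scaling (Carroll & Chang, 1970); the
formula is not given in the text above, and the coordinates, weights, and
function name are made-up values for illustration only.

```python
import math


def weighted_distance(point_a, point_b, weights):
    """Distance between two stimuli as seen by one subject.

    Assumed form: each squared coordinate difference in the shared group
    space is scaled by that subject's weight for the dimension.
    """
    return math.sqrt(
        sum(w * (a - b) ** 2 for a, b, w in zip(point_a, point_b, weights))
    )


# Hypothetical group-space coordinates for two vowels on three dimensions.
vowel_a = (-1.0, 0.5, 0.2)
vowel_b = (1.0, -0.5, 0.2)

# Two hypothetical subjects: one weights the dimensions evenly, the other
# attends mostly to dimension 1.
subject_even = (0.5, 0.5, 0.5)
subject_dim1 = (0.9, 0.1, 0.1)

d_even = weighted_distance(vowel_a, vowel_b, subject_even)
d_dim1 = weighted_distance(vowel_a, vowel_b, subject_dim1)
print(d_even, d_dim1)
```

Both hypothetical subjects refer to the same coordinates; only the weights
differ, so the same pair of vowels can seem more or less alike to different
listeners without any change in the identity of the dimensions.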

A noteworthy property of individual differences scaling is that it
dictates the orientation in which the scaling solution must be interpreted.
When attempting to relate the solution to potentially relevant factors, an
investigator is obliged to make reference to the shared perceptual dimensions.
These dimensions enjoy a priority because they are the ones that account for
the greatest percentage of variance in the several subjects' data. The
presence of this interpretive restriction sets individual differences scaling
apart from other variants of multidimensional scaling and strengthens claims
that the dimensions of its scaling solution have some psychological reality
(Carroll & Wish, 1974; Kruskal & Wish, 1978; Wish & Carroll, 1974).

It has often been pointed out (e.g., Kruskal & Wish, 1978; Wish &
Carroll, 1974) that since multidimensional scaling methods are intended to aid
in the description and understanding of data, evaluation of the "correctness"
of various scaling decisions must be based on considerations that are
substantive as well as statistical. The substantive considerations have to do
with such factors as the interpretability and stability of results, the
statistical with the goodness-of-fit between model and data. On the basis of
these considerations, it was determined that the present data were most
appropriately modeled: (a) in three dimensions; and (b) at the ordinal scale
of measurement.

Dimensionality of the space. With individual differences scaling, a
commonly used index to goodness-of-fit is the percentage of variance accounted
for (VAF) in the several subjects' data (see, for example, Carroll & Chang,
1970). Increasing the number of dimensions will increase the VAF index,
since the model has added degrees of freedom with which to fit the data. The
gains tend to diminish exponentially, however, and each new increment must be
weighed against the substantive considerations mentioned earlier
(interpretability and stability). Figure 1 displays the VAF function for the
present data when modeled in two to five dimensions. The exponential nature of
the function is clear. There is a relatively large increment in VAF for the
shift from two to three dimensions, a much smaller one for the shift from
three to four dimensions, and a negligible decrement (see footnotes 3 and 4)
for the shift from four to five dimensions.

On the basis of these statistical data, at least, it appears that three
or perhaps four dimensions would be the appropriate modeling choice. In the
former case, 70% of the variance would be accounted for, in the latter, 72%.

The three-dimensional solution was chosen over the four-dimensional for
two reasons: first, as will be seen shortly, all three of the dimensions are
linguistically meaningful and therefore interpretable, and second, they are
stable in that they emerged, as well, from analyses of individual subject data
collected in a memory experiment (Experiment 2) and from separate analyses of
the two perceptual conditions of this experiment.

Figure 1. Percentage of variance in the perception data accounted for by
modeling it in from two to five dimensions.

Nonmetric scaling. The decision to model these data at the nonmetric
(i.e., ordinal) scale of measurement was based on two considerations. The
first was simply that the subjects' task was to make ordinal perceptual
judgments. On each trial, it was required that they identify the most-alike
and least-alike vowel pairs, but it was not required that they quantify the
strengths of those pairwise relationships. It is true that by summing over
trials some quantification was arrived at, but it was felt that the most
conservative treatment of these data was to model their ranks.
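
What it means to model ranks rather than summed scores can be illustrated
with a small sketch. The scores below are invented, the function name is
hypothetical, and averaging the ranks of tied values is one common convention
assumed here; the scaling program actually used may have treated ties
differently.

```python
def rank_scores(scores):
    """Return ranks (1 = smallest); tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Invented summed scores for five vowel pairs.
summed_scores = [7, -3, 0, 7, 2]
ranks = rank_scores(summed_scores)
print(ranks)

# Any order-preserving rescaling of the scores leaves the ranks unchanged,
# which is why an ordinal analysis ignores the size of score differences.
rescaled = [10 * s + 5 for s in summed_scores]
print(rank_scores(rescaled) == ranks)
```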

The second reason for operating at the nonmetric scale has to do with the
stability of the modeling outcome. The metric/nonmetric distinction made
relatively little difference with respect to the present data, but it greatly
affected the outcome of modeling the results of a memory experiment
(Experiment 2) that was run, in part, to clarify the findings of the present
perceptual experiment. This point will be discussed in greater detail later
(see the section on Analysis of the Data of Experiment 2). For now, it is
enough to note that a nonmetric scaling of these data revealed structure that
was stably present for both of the conditions of Experiments 1 and 2.

Results

Group space. The group space for all subjects (isolated-vowels and
consonantal-context conditions combined) is shown in Figure 2. Dimension 2 is
plotted against dimension 1 in the top half of this figure, dimension 3
against 1 in the bottom half. These dimensions are orthogonal to one another
and can be considered independently. To do this, it is useful to "project"
the points visually onto each axis and consider their ordering. In this way,
it was determined that each of the dimensions of the group space corresponds
closely to a traditional feature of linguistic description for vowels.
Dimension 1, for instance, corresponds to a feature linguists have variously
called advancement, front/back, and grave/acute. After Singh and Woods (1970),
the term advancement will be used here. This feature distinguishes vowels such
as /i, ɪ, ɛ, æ/ (seen to "project" onto the lower end of dimension 1) from
other vowels such as /ʌ, ɑ, ɔ, o, ʊ, u/ (which "project" onto the upper end of
the dimension). Dimension 2, in turn, corresponds to what has been called the
height or compactness feature (height will be used here), and dimension 3 to
tenseness or length (tenseness is the term that will be used). (See, e.g.,
Hockett, 1958; Jakobson & Waugh, 1979; Ladefoged, 1971, 1975, for
comprehensive reviews of the vocabulary of vowel feature description.)


These features repeatedly surface when linguists try to document various
aspects of linguistic behavior. To take just one example, speakers of
English, though generally unaware of it, observe a grammatical rule for vowel
usage that respects the tenseness feature: in English, words can end with
"tense" vowels like /i, o, u/ (there are words like "he," "go," and "you"),
but they cannot end in "lax" vowels like /ɪ, ɛ, ʊ/ (there are no words ending
in the vowel sounds heard in the middle of words like "hit," "bet," and
"book"). English speakers must be at least tacitly aware of this rule, since
they respect it when creating new words for the language. There are countless
other instances of linguistic behavior that is systematically related not only
to the tenseness feature but to the advancement and height features as well
(see, e.g., Jakobson & Waugh, 1979).

There is, as well, some evidence to support the claim that all of these
features play a perceptual role (e.g., Hanson, 1967; Shepard, 1972; Singh &
Woods, 1970). The present results are both consistent with such a claim and
particularly compelling in this regard given the nature of the scaling
analysis that was employed here. Individual differences in the several
subjects' data provided information that allowed for a nonarbitrary
determination of the dimensions of the group space. Those dimensions shown in
Figure 2 are the three that optimally accounted for the variance in the
linguistic judgments made by the subjects who participated in Experiment 1.
The fact that each of those dimensions, in turn, corresponds closely to a
linguistic feature strongly suggests that those features have some shared
perceptual significance among speakers of English (see Rakerd, 1982, for an
expanded consideration of this point).

Weight space. The concern in this section will be to look at individual
differences in weighting of the dimensions of the group space, and, in
particular, at differences between the isolated-vowels and consonantal-context
subjects. The weight space for all subjects is shown in Figure 3. Weightings
for dimension 2 are plotted against those for dimension 1 at the top of the
figure; dimension 3 weightings are plotted against dimension 1 weightings at
the bottom. Each O represents an individual from the isolated-vowels
condition, each X represents an individual from the consonantal-context
condition.

The first thing to notice in this figure is that the X's are more tightly
grouped than the O's in both the dimension 2-by-dimension 1 and dimension
3-by-dimension 1 planes. This indicates that there was less variability
among consonantal-context subjects than there was among isolated-vowels
subjects. In order to assess the statistical significance of this difference,
a variance ratio (Snedecor's F) was calculated for the three-dimensional space
as a whole. For each condition the variance in three-space was determined as
follows. First, the centroid, or average subject weight, was located in the
space. Then, the distance from this centroid was calculated for each subject
according to the Pythagorean theorem. And finally, the average squared
distance was computed as the measure of variance. This measure is strictly
analogous to the variance statistic (sigma squared), which is the average
squared deviation about the mean for a set of numbers. The difference in
variability between the two experimental conditions was in fact significant,
F(11,10) = 3.21, p < .05.
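
The three steps just described (centroid, distance, average squared distance)
can be sketched as below. The subject weights are invented for illustration,
the function names are hypothetical, and the sketch divides by the number of
subjects because the text speaks of an "average" squared distance; whether the
original computation used n or n - 1 is not stated.

```python
def centroid(points):
    """Average subject weight on each dimension."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def spread(points):
    """Average squared Euclidean distance of the points from their centroid."""
    c = centroid(points)
    squared = [sum((x - m) ** 2 for x, m in zip(p, c)) for p in points]
    return sum(squared) / len(points)


def variance_ratio(group_a, group_b):
    """Ratio of the two spreads (group_a over group_b)."""
    return spread(group_a) / spread(group_b)


# Invented weights on three dimensions, for illustration only.
isolated = [(0.8, 0.2, 0.1), (0.2, 0.8, 0.1), (0.1, 0.2, 0.8), (0.5, 0.5, 0.4)]
context = [(0.5, 0.45, 0.4), (0.45, 0.5, 0.45), (0.5, 0.5, 0.5), (0.55, 0.45, 0.45)]

ratio = variance_ratio(isolated, context)
print(ratio)
```

A ratio well above 1 for made-up data like these simply reflects the tighter
clustering of the second group; judging significance would still require the
appropriate F distribution and degrees of freedom.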

It was observed, then, that in a perceptual task in which listeners were
asked to relate a set of vowel sounds on the basis of their linguistic
qualities, there was significantly greater agreement among individuals who
heard vowels in a consonantal context than there was among those who heard
isolated vowels. It can be inferred from this that the subjects employed
somewhat different perceptual strategies in the two conditions. This, in
turn, suggests the need for caution in generalizing from what is known about
the perception of isolated vowels to the perception of vowels in context, a
generalization that has often been made in the past (e.g., Chiba & Kajiyama,
1958; Joos, 1948). Also, it would seem to warrant a methodological caveat
for those who do vowel research in the future: namely, that they would do
well to look at vowels in consonantal context. Given the importance of these
implications, it was deemed appropriate to look at the stability of this
result, and to ensure that it was not an artifact of the scaling procedure.

Regarding scaling, it is noteworthy that part of the variability in the
weight space reflects differences in the goodness-of-fit between the scaling
model and the individual data. Subjects whose data were well fit by the model
lie further from the origin of the space (in some direction) than do those
whose data were poorly fit. It may be that the observed condition difference
in variability was, in fact, a difference with respect to goodness-of-fit.
One reason for believing this was not the case, however, is that, on average,
the subject data in the two conditions were about equally well fit by the
model. The average VAF for the isolated-vowels condition was 71%, and that for
the consonantal-context condition was 69%, a difference that did not even
approach significance.

Whether goodness-of-fit was significantly different for the two groups or
not, it was certainly a source of variation that is of limited interest here.
Therefore, the data were transformed to "factor out" its influence. A
subject's weight on a dimension is the square root of the percentage of
variance accounted for by that dimension. The total variance accounted for
(VAF) can thus be computed for any individual by squaring the weights and
summing over all three dimensions. Between-subject differences in
goodness-of-fit can, in turn, be compensated for by normalizing the data with
respect to this VAF value. The most straightforward strategy for doing this is
to divide a subject's squared dimension weights by VAF and to take the square
roots of the resultant quotients to be the adjusted weights.

These new values index subjects in the weight space shown in Figure 4.
It can be seen to reflect statistical compensation for goodness-of-fit
differences between subjects, in that the original weights (shown in Figure 3)
have been "compressed" along lines extending out from the origin of the space.
Despite this compensation, the condition difference in subject variability
remains significant, F(11,10) = 3.18, p < .05.
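
The adjustment for goodness-of-fit can be sketched directly from its verbal
description: square the weights, sum them to get VAF, divide each squared
weight by VAF, and take square roots. The weights below are invented, the
function names are hypothetical, and the sketch assumes nonnegative weights
and a nonzero VAF.

```python
import math


def total_vaf(weights):
    """Total proportion of variance accounted for: sum of squared weights."""
    return sum(w * w for w in weights)


def adjust_weights(weights):
    """Rescale weights so that differences in overall fit are factored out."""
    vaf = total_vaf(weights)
    return tuple(math.sqrt(w * w / vaf) for w in weights)


# Invented weights for one subject on the three dimensions.
raw = (0.6, 0.5, 0.3)
adjusted = adjust_weights(raw)

print(total_vaf(raw))        # overall fit before adjustment
print(adjusted)              # adjusted weights
print(total_vaf(adjusted))   # overall fit after adjustment
```

After adjustment every subject has the same summed squared weight, so subjects
move along a line through the origin while the relative sizes of their
dimension weights are preserved.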

It proves to be the case, then, that even when individual differences in
the goodness-of-fit of the model are factored out, there remains a significant
difference between the two experimental conditions with respect to the subject
variability in the weight space. This finding clearly supports the view that
vowels are perceived significantly differently in consonantal context than
out. It also hints at the nature of the difference, at least for the present
experiment. The task set for subjects was to relate a number of different
vowel sounds on the basis of their linguistic qualities. Subjects who heard
vowels in /dVd/ consonantal context exhibited significantly greater agreement
as to what those linguistic relations were than did their counterparts who
heard isolated vowels. Thus, it can be said that one of the effects of
context was to stabilize linguistic judgments across subjects.

It is useful to take note of the nature of the stability; the
consonantal-context subjects clustered toward the center of the weight space,
which indicates that they attached roughly equal weight to all three
linguistically-meaningful dimensions of the group space. Notice that this need
not necessarily have been the case. Between-subjects agreement would have been
equally high for this condition had the clustering occurred out near one of
the "corners" of the weight space, whereupon one or another of the perceptual
dimensions could have been said to predominate. It turned out, however, that
all three of the dimensions had substantial perceptual import for
consonantal-context subjects.

The situation was markedly different for the isolated-vowels subjects.
While several members of that group were positioned near the center of the
weight space, most of them were in more "extreme" locations. The data for
this latter group were largely accounted for in terms of perceptual
sensitivity to just one or two of the linguistic dimensions. And it should be
noted that the one or two dimensions that predominated were different for
different individuals. Thus it can be seen that isolated-vowels subjects were
not constrained to perceive the stimuli in terms of the full set of linguistic
dimensions. To the contrary, they attended to the dimensions in a piecewise
manner, while the consonantal-context subjects integrated the dimensions in a
more linguistically-appropriate way.

Summary. The individual differences scaling analysis revealed two ways
in which vowels in consonantal context can be said to have been perceived more
linguistically than isolated vowels. First of all, judgments about the
linguistic qualities of vowel sounds were significantly more stable across
subjects when the vowels were in context. And second, three
linguistically-meaningful dimensions of vowels were more integrated in
perception when vowels were in context.

Discussion

How is this effect of consonantal context to be understood? It will be
argued that, broadly speaking, there are two classes of accounts that might be
brought to bear to explain it and that the results of Experiment 2 lend at
least suggestive support to one over the other. The first class, which will
be called knowledge-based accounts, turns on a subject's understanding of
certain regularities in the occurrence of vowel categories. The second class,
called stimulus-based accounts, turns on a subject's sensitivity to properties
of the stimuli themselves. To illustrate the differing character of
knowledge-based and stimulus-based accounts, several examples of each are
provided below.

Knowledge-based accounts. One plausible example of a knowledge-based
account is motivated by the fact that in English vowels most generally occur
in some consonantal context. The context condition of the present experiment
might therefore be expected to engage linguistic processing most effectively.
Such an argument is encouraged by the observation that frequency of occurrence
does positively affect performance on a number of other indices to language
skill, such as the reaction time to identify a word as being in one's lexicon
(Forster & Chambers, 1973). On the other hand, a wealth of linguistic
phenomena resist explanation in such terms. Witness, in this regard, the fact
that readers of Japanese name colors more rapidly when they are written in
kana (a representation of the phonologic structure of the language) than in
kanji (a logographic representation) even though the latter form is seen much
more often (Feldman & Turvey, 1980).

A knowledge-based account of a rather different sort could draw on the
fact that certain phonological rules for vowel usage are specific to
consonantal context. One such rule that has already been mentioned is that
"tense" vowels can occur at the ends of syllables but "lax" vowels cannot. A
listener asked to make judgments about the linguistic quality of vowel sounds
might therefore have a difficult time with an isolated "lax" vowel like /ɪ/,
since no such isolated vowel is allowed in English. Singh and Woods (1970)
advanced just such an argument to account for the fact that they found no
evidence that tenseness had perceptual significance for listeners who rated
the relative similarity of a set of isolated American English vowels. On the
basis of the present findings, however, that failure might possibly be
attributed to the fact that those investigators averaged their data over
subjects prior to scaling. For certain isolated-vowels subjects in the present
experiment, the tenseness dimension was particularly salient. For others,
however, it had little or no salience. Averaging over all subjects, then,
could "wash out" any statistical evidence of the significance of the tenseness
feature.

Investigators (Assmann et al., 1982; Macchi, 1980) have also pointed to
this phonological restriction on isolated vowel usage as a potential
explanation for the recurring observation that vowels can be identified more
accurately in consonantal context than out (e.g., Gottfried & Strange, 1980;
Strange et al., 1976; Strange et al., 1979). This cannot explain the
phenomenon in full, however, since Strange et al. (1979) have also observed a
consonantal influence in CV syllables (e.g., "be") in which "lax" vowels are
as phonologically impermissible as they would be in isolation.

Whatever the outcome of these individual debates, the tenor of this sort
of knowledge-based account is clear: to the degree that listeners are
sensitive (either consciously or unconsciously) to the fact that a
phonological rule of English proscribes the occurrence of certain vowel sounds
in isolation, those listeners' linguistic judgments may be affected.

Recently, it has been shown that a knowledge of how speech sounds are
written may have an effect as well. For instance, listeners will more rapidly
detect the rhyming quality of spoken words when those words are spelled alike
(e.g., "fight"/"right") than when they are not ("you"/"two") (Seidenberg &
Tanenhaus, 1979). It is perhaps relevant, then, to note that the subjects in
the present experiment were literate and therefore had had a great deal of
experience in reading and writing vowels. In at least one previous study
(Diehl et al., 1981) it has been suggested that such experience can lead to a
perceptual bias in favor of consonantal context.

Knowledge-based accounts of the consonantal influence have in common the
fact that they look to a subject's long-term experience with stimuli of a
particular type. Such accounts would have it, for example, that extended
acquaintance with frequently occurring items, or with certain phonological or
orthographic regularities regarding those items, explains the perceptual
effect that was observed in Experiment 1. Thus, the "locus" of knowledge-based
effects is at some remove from immediate stimulation. That is to say, these
accounts have much more to do with the sorts of accumulated knowledge that
might be brought to bear in processing stimulus information than they do with
the information itself. Not so the more stimulus-based accounts that will now
be considered.

Stimulus-based accounts. As examples of stimulus-based accounts, consider
two that are motivated by the fact that (as is typically the case) the
isolated-vowels stimuli tended to exhibit relative spectral constancy over
their course (only the vowels /o/ and /u/ were noticeably diphthongized),
while the vowels in /dVd/ context tended to be marked by more or less
continuous formant frequency change.

One reason why formant change may have been the source of the enhanced
linguistic processing for vowels is that its presence or absence may have
differentially affected the duration of a vowel's representation in what have
been called auditory and phonetic memory stores. By hypothesis, the former
preserves a relatively unprocessed "neural analog" of the acoustic signal and
the latter preserves features of the input that are specifically relevant to
speech. Fujisaki and Kawashima (1969, 1970; see also Pisoni, 1973) have
pointed to the differential presence of vowels and consonants in these
memories as a potential psychoacoustic basis for the observation that the
former (particularly isolated vowels) tend to be less categorically perceived
than the latter (Fry, Abramson, Eimas, & Liberman, 1962; Pisoni, 1973). The
present argument would simply extend this reasoning to a perceptual difference
that holds between two classes of vowels, those in and out of consonantal
context.

An alternative reason why formant change might be expected to engage
linguistic processing particularly effectively is that properties of the
speech signal are, most generally, dynamic in character (Liberman, 1982).
This is a consequence of the fact that the several segments of an utterance
tend to impose competing demands on the articulators, making it necessary for
talkers to interleave their productions in a series of rapid articulatory
gestures. By the laws of physical acoustics, these gestures result in an
assortment of dynamic modulations of the signal. Owing to this fact, a speech
perceiver might be expected to be particularly attuned to any sort of acoustic
change.

Though other stimulus-based accounts might be advanced, it should be
apparent from just these that the focus of all such accounts will be on some
acoustic property or properties of the stimulus set.

How to distinguish between the two classes of accounts. An essential
difference between knowledge-based and stimulus-based accounts has to do with
the degree to which they reflect a sensitivity to properties of the
stimulation. Since knowledge-based factors turn on long-term experience with
stimuli of a type, they should be relatively little affected by the immediate
experience obtained through any particular encounter with a stimulus.
Stimulus-based factors, by contrast, are expressly defined in terms of
stimulus properties. It should be possible, then, to gain some evidence as to
the "locus" of the consonantal influence observed here by looking at a case in
which the relative contribution of immediate stimulation is reduced. If
knowledge-based factors were critical to the present result they should be
expected to be manifest there as well. If, instead, stimulus-based factors
were most important here, the consonantal influence should be diminished.
Experiment 2 provides a case relevant to these predictions.

Experiment 2

In this experiment, subjects' memories for vowel sounds were examined
with a procedure analogous to that employed in Experiment 1. The subjects
were asked to imagine vowels, as occurring in isolation or in /dVd/ context,
and to make judgments about the linguistic relationships among the images. It
was expected, first of all, that an analysis of these judgments might help to
clarify the results of Experiment 1. There it was found that the presence of
consonantal context had the effect of evoking somewhat more linguistic
perceptual processing of vowels than occurred in its absence, and it was
concluded that while a number of different accounts of this effect could be
put forth, broadly speaking, these either turned on various properties of the
stimuli themselves or on properties having more to do with the occurrence and
recurrence of vowels as meaningful categories in English. If these latter,
more knowledge-based factors are the critical ones, then it might be expected
that the presence or absence of consonantal context would affect the outcome
of this memory experiment no less than it did that of the perception
experiment, because vowel usage rules, orthographic regularities of vowel
transcription, and so on remain in force here. On the other hand, to the
degree that stimulus properties are critical to the effect, the condition
difference should be reduced here (or perhaps eliminated) since vowel memory
is at some remove from the acoustic stimulation.

The results of this second experiment could also prove useful in a second
way: they could point to the dimensions of organization for subjects'
long-term memorial representations of vowels. A question arises, for
instance, about the number of such dimensions. Are there three as in
Experiment 1, and if so, are these the same three linguistically-meaningful
dimensions that were found to have perceptual import? And it may be asked
whether the same dimensions are utilized in the same way by different
subjects.




Rakerd: Vowels in Consonantal Context Are Perceived More Linguistically 



Method 

Stimuli. The stimuli for Experiment 2 were written analogs of the spoken
stimuli used in Experiment 1. That is to say, they were orthographic
representations of both isolated vowels and vowels in a trisyllabic frame in
which the medial syllable was stressed /dVd/. The stimulus set comprised the
same ten vowels that were used in the perception experiment. Table 2 presents
a summary of all stimuli.



Table 2

Stimuli for Experiment 2. IV = isolated vowels, /dVd/ = vowels in consonantal
context.

              Spellings            English Exemplars
Vowel       IV      /dVd/        1        2        3

  i         EE      ADEEDA       eat      heel     brief
  ɪ         IH      ADIHDA       it       him      brim
  ɛ         EH      ADEHDA       egg      hen      bread
  æ         AE      ADAEDA       at       ham      brash
  ʌ         UH      ADUHDA       up       hull     brush
  ɑ         AH      ADAHDA       odd      hop      bronze
  ɔ         AW      ADAWDA       ought    haul     brawn
  o         OH      ADOHDA       oat      home     broach
  ʊ         UU      ADUUDA       oomph    hood     brook
  u         OO      ADOODA       ooze     hoop     broom


In English orthography there are numerous ambiguities with respect to the
spelling of vowel sounds. The letters "OO," for example, stand for the vowel
/u/ in the word "tool" and for /ʊ/ in the word "book." There are indications
that these spelling ambiguities can affect listeners' perceptions of vowels
(Assmann et al., 1982) and, while the present experiment was not strictly
perceptual, it was thought advisable to devise vowel spellings that were
unique to each sound. These are presented in the second and third columns of
Table 2. In all cases the vowels were spelled with two-letter sequences.
These sequences were presented alone for isolated vowels and embedded in the
frame AD__DA for the trisyllables. In the latter case subjects were told to
read each stimulus as a three-syllable nonsense word, the first and last
syllables of which consisted of unstressed schwa (/ə/) vowels and the middle
syllable of which was a stressed /dVd/.

Subjects were familiarized with the new orthography with the aid of a
training sequence. This sequence paired each written vowel form with three
English monosyllabic words containing the vowel sound that the form was meant
to represent (see Table 2). These words were selected so as to be similar to,
but distinct from, both the isolated and /dVd/ contexts employed in the
experimental test.

The test series consisted of triads of stimuli presented in three adjacent
columns. All possible triadic combinations of the vowels were included in
each series. The order of occurrence of triads was randomized, as was the
assignment of the words of each triad to the columns.
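
The construction of a test series can be illustrated with a minimal Python
sketch. It assumes that each unordered triad of the ten vowels appears exactly
once per series, which gives 120 test lines; the function name, the `context`
flag, and the `seed` parameter are hypothetical and are not part of the
original procedure.

```python
import itertools
import random

# Two-letter vowel spellings taken from Table 2.
VOWELS = ["EE", "IH", "EH", "AE", "UH", "AH", "AW", "OH", "UU", "OO"]


def make_test_series(context=False, seed=None):
    """Return a randomized list of triads, one 3-tuple per test line.

    context=False gives isolated-vowel spellings; context=True embeds each
    spelling in the AD__DA frame. (Hypothetical helper, for illustration.)
    """
    rng = random.Random(seed)
    # Every unordered combination of three different vowels.
    triads = [list(t) for t in itertools.combinations(VOWELS, 3)]
    rng.shuffle(triads)          # randomize the order of the triads
    for triad in triads:
        rng.shuffle(triad)       # randomize assignment of items to columns
    if context:
        triads = [["AD" + v + "DA" for v in triad] for triad in triads]
    return [tuple(t) for t in triads]


series = make_test_series(context=True, seed=1)
print(len(series))  # 120
```

With ten vowels there are 10!/(3! 7!) = 120 distinct triads, so each subject
would judge 120 lines under this assumption.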



Instructions. As nearly as possible, the instructions for Experiment 2
paralleled those for Experiment 1. It was explained to subjects that their
task would be to imagine a number of different vowel sounds and to make
linguistic comparisons of the images. A sense of what it would mean to make
linguistic comparisons was again provided by the example of distinguishing an
adult's and a child's productions of the vowel /I/ from their productions of
vowels such as /A/ and /i/ (see the Instructions section of Experiment 1).

The triadic comparisons testing procedure was explained in detail. Subjects
were told that they would be given a test series (either the isolated-vowels
series or the consonantal-context series) and a cover sheet. The cover sheet
had a small slit cut in it to allow the viewing of only one test line (a
trial) at a time. The procedure was: (1) to move the cover sheet down the
test page, thereby exposing the three stimuli of a trial; (2) to make a
triadic comparison among the images of the three vowels represented on the
line; (3) to write down the column numbers of the most-alike and least-alike
vowel pairs; and (4) to proceed to the next trial. It was emphasized to
subjects that they were under no time pressure. To the contrary, they were
instructed to proceed at whatever pace they found comfortable, with the
constraint that they not look back at any trial once it had been completed.
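
The text does not say how the most-alike and least-alike responses were
converted to the pairwise data that were scaled. The sketch below shows one
common convention, offered here only as an assumption: within each trial the
most-alike pair scores 0, the least-alike pair scores 2, and the remaining
pair scores 1, and scores are summed over trials. The function name and the
trial format are hypothetical.

```python
from collections import defaultdict

# The three column pairs that exist on every test line.
COLUMN_PAIRS = [frozenset(p) for p in ((1, 2), (1, 3), (2, 3))]


def score_trials(trials):
    """Accumulate pairwise dissimilarity scores from triadic judgments.

    Each trial is (stimuli, most_alike, least_alike): stimuli is the 3-tuple
    shown in columns 1-3, and the other two entries are pairs of column
    numbers. The 0/1/2 scoring rule is an assumed convention, not the
    author's documented method.
    """
    totals = defaultdict(int)
    for stimuli, most_alike, least_alike in trials:
        most, least = frozenset(most_alike), frozenset(least_alike)
        if most == least or most not in COLUMN_PAIRS or least not in COLUMN_PAIRS:
            raise ValueError("each trial needs two different column pairs")
        for cols in COLUMN_PAIRS:
            first, second = (stimuli[c - 1] for c in sorted(cols))
            score = 0 if cols == most else 2 if cols == least else 1
            totals[frozenset((first, second))] += score
    return dict(totals)


# One hypothetical trial: columns 1 and 2 most alike, columns 1 and 3 least.
scores = score_trials([(("EE", "IH", "OO"), (1, 2), (1, 3))])
print(scores[frozenset(("EE", "OO"))])  # 2
```

Summing such scores over all trials yields one dissimilarity value per vowel
pair for each subject, which is the kind of matrix an individual-differences
scaling procedure takes as input.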

Orthographic training and administration of a pretest. Prior to the test,
subjects were given extensive work with the orthographic training sequence.
The experimenter first read the sequence aloud, pointing out potential errors
to be avoided. Next, subjects were allowed to ask questions about the
spellings and then the sequence was read aloud a second time. Finally, the
subjects were told to study the sequence on their own for as long as was
needed to commit the spellings to memory.

At the end of the individual study sessions and before the actual test
series were presented, the subjects were asked to complete a pretest designed
to assess competency with the new orthography. The pretest was
straightforward. Subjects were presented a randomized list of written vowel
stimuli and asked to give three examples of English words that contained the
vowels indicated. The examples they gave had to be different from those used
in the training sequence. Subjects' test results were omitted from the data
analysis if they made more than one error on the pretest.

Subjects. Thirty-three undergraduates, enrolled in an introductory
psychology course at the University of Connecticut, participated in the
experiment for course credit. These individuals were native English speakers.
They had no prior knowledge of either the purpose of the experiment or its
design. On the basis of their performance on the pretest, six subjects were
eliminated. Twelve of the remaining 27 subjects were in the isolated-vowels
condition; 15 were in the consonantal-context condition.

Analysis of the Data

Since Experiment 2 was designed to test subjects' memories for vowels in
a way that paralleled the perception test of Experiment 1, there was reason to
expect that the most appropriate scaling solution for the present data might
be the three-dimensional one that appeared earlier. After various modeling
alternatives were examined, it was concluded that, with one methodological
exception to be noted below, this was in fact the case.

Dimensionality of the space. The percentage of variance accounted for
(VAF) was computed as a function of the number of modeling dimensions. This
function had roughly the same shape as its counterpart in Experiment 1, an
observation consistent with the expectation that a three-dimensional modeling
of these data might prove as appropriate here as it did in Experiment 1. The
VAF comparison with Experiment 1 also showed that at each dimension level
these memory data were somewhat less well fit by the model than were the
perception data, which is to say that the data were somewhat "noisier" here.
This is not surprising. In the perception test, the items presented to
subjects were highly familiar (spoken English vowels) and were, in fact, the
perceptual objects of interest. Here, by contrast, the items presented were
rather unfamiliar vowel spellings, which only mediated contact with the memory
images that were the true objects of study.

Nonmetric scaling. It has been pointed out that the stability of a
modeling outcome must be considered when making decisions about scaling (see
Wish & Carroll, 1974, for a discussion of this point). With respect to the
present study, this consideration bore most directly on the decision to
perform nonmetric individual-differences scaling, as against a more commonly
used metric procedure such as INDSCAL (Carroll & Chang, 1970). For certain of
the analyses of Experiment 1 in particular, the metric/nonmetric modeling
distinction made little or no difference in the outcome. However, this could
not be said to be the case in Experiment 2; modeling these data at the metric
scale resulted in an uninterpretable group space. At the nonmetric scale, on
the other hand, the group space was not only interpretable, but was quite
evidently related to the group space of Experiment 1.

This perception/memory difference in what might be called "measurement
level" may be interesting in its own right. It suggests that the memory space
for vowels is a sort of nonlinearly transformed version of the perceptual
space. Interval relationships among the vowels hold in perceptual space but
not in memory. On the other hand, the relatively noisy character of the
memory data has already been noted, and the "measurement level" difference
between the perception and memory experiments may simply reflect task
variables. Whatever the true state of affairs, the approach taken in this
study has been to model all data at the more conservative nonmetric level.

Starting configuration. It proved to be the case that the three
dimensions of the group space could not be interpreted as originally modeled.
They neither corresponded to the linguistic features of advancement, height,
and tenseness as they had in Experiment 1, nor to other recognized features of
articulatory or acoustic description for vowels. This was equally the case
for the two- and four-dimensional group spaces.

The scaling procedure was therefore rerun in three dimensions with the
group space of Experiment 1 taken as a starting configuration. This was, in
effect, a test of the appropriateness of that earlier group space as a model
for memory data structure. It did prove to be an appropriate model, as
evidenced by the fact that it fit the memory data nearly as well as had the
original, uninterpretable, three-dimensional solution (mean VAF was 59% with
the starting configuration, 61% without it).

Results 

Group space. The group space for all subjects who participated in the
two conditions of the memory experiment is shown in Figure 5. It is quite
evidently similar to the group space for Experiment 1 (Figure 2). In the
dimension 2-by-dimension 1 plane, only the vowel /a/ has shifted its position
substantially: it is "higher" and more "fronted" in the present analysis. In
the dimension 3-by-dimension 1 plane, the only vowels that moved noticeably
are /æ/ and /q/. The former can be seen to have taken on a dimension 3 value
that is somewhat more "tense," the latter one that is more "lax." These
shifts do not substantially alter the overall configuration, however. On the
whole, then, it can be said that this combined group space for the memory
experiment does not differ substantially from that for the perception
experiment. In both cases, the non-arbitrary axes of the space correspond to
the linguistic features of advancement, height, and tenseness.

Weight space. Since the group spaces are similar for Experiments 1 and
2, it is interesting to see how the weight spaces compare. Indeed, a primary
motivation for carrying out Experiment 2 was to determine whether the
condition difference in dimension weightings seen for perception would be
manifest in memory as well. A look at Figure 6, which displays the combined
weight space for Experiment 2, indicates that it was not.

In Experiment 1, subjects in the consonantal-context condition were
consistent with one another in attaching substantial weight to all three
linguistically-meaningful dimensions of the group space, while isolated-vowels
subjects were quite variable, with different individuals weighting different
dimensions disproportionately. Here, by contrast, subjects in both conditions
behaved in a fairly comparable way: they clustered toward the center of the
weight space (roughly as did the consonantal-context subjects of Experiment
1). It turned out that isolated-vowels subjects were, if anything, less
variable in exhibiting this pattern than were their consonantal-context
counterparts, the opposite result from that observed in Experiment 1. (This
trend was not significant in the original weight space shown in Figure 6,
F(14,11) = 2.27, but was in a weight space adjusted to compensate for
goodness-of-fit differences among subjects (cf. Experiment 1), F(14,11) = 4.30.)

Clearly, the pattern of dimension weightings obtained for memory
judgments made at some remove from the acoustic stimulation is substantially
different from that obtained in perception. This strongly suggests that
stimulus-based factors were critical to the perceptual influence of consonants
that was observed in Experiment 1.

Discussion

In Experiment 1, it was concluded that /dVd/ context had the effect of
evoking more linguistic perceptual processing of vowels than occurred in
isolation. There are a number of knowledge-based accounts of why this might
have been the case, including the facts that vowels more frequently occur in
consonantal context than out, that certain phonological rules are specific to
consonantal contexts in English, and that regularities (and irregularities) in
English orthographic representations of vowels may differ with context. Since
these knowledge-based factors reflect a history of experience with vowels as
meaningful categories in English, it might be expected that they would have an
influence in this vowel memory experiment as well. However, the variant
analyses of the subject weights indicate that they did not. It can be at
least tentatively concluded, therefore, that the consonantal influence in
perception had more to do with stimulus-based factors than with
knowledge-based factors.

In Experiment 1, a close correspondence was observed between three
features of linguistic description for vowels (advancement, height, and
tenseness) and the three dimensions of the group space. The fact that
individual-differences scaling showed these to be the dimensions that
optimally accounted for variance in the several subjects' data was taken as
particularly strong evidence that those linguistic features have some
significance for the perception of vowels. A related point can now be made
with respect to vowel memory, although it must be somewhat tempered by the
reservation that the present analysis was initiated by a starting
configuration. The orientation of axes for the resulting group space was
nevertheless dictated by the character of individual subject data, and the
observed correspondence between the linguistic features and the dimensions of
this space strongly suggests that listeners' memories for vowels are, at least
in some measure, organized in a way that respects those features. Thus, there
appears to be a consistent recurrence of the features in perception, in
memory, and in linguistic behavioral data such as those having to do with
grammatical rules for vowel usage.

It is important to recognize that altogether different results might have
obtained. First of all, the several subjects participating in this memory
experiment might have exhibited no consistent pattern of responding at all, in
which case the model would have failed to account for a reasonable percentage
of the variance in the data and the dimensions of the group space would have
been uninterpretable. Alternatively, to the degree that subjects behaved
consistently, they might have done so in a way that made little or no sense
from a linguistic standpoint. Since the stimuli of this experiment were
presented by eye, subjects might, for example, have made their judgments on
the basis of visual features of the input, but they did not.

Summary and Conclusions

This study was motivated by an interest in the question of whether vowel
perception is greatly influenced by the consonantal context in which a vowel
occurs. A good deal is known about the perception (and production) of
isolated vowels, and an answer to this question of consonantal influence will
determine how researchers generalize from that knowledge base. To the degree
that the influence on perception is minor, the isolated vowel form might
reasonably be viewed as canonical (since it is unencumbered by any context
effects at all) and its acoustic signature might be taken to be composed of
the essential information for vowel perception. On the other hand, if
consonantal context is found to affect the perception of vowels significantly,
then the isolated form can only be considered to be one variant of the vowel
and, given the infrequency of its occurrence in natural speech, an arguably
unrepresentative variant. Caution would therefore be required in generalizing
from what is known about it.

The results of the present study clearly support the latter position.
Vowels were here found to be perceived significantly differently in
consonantal context than they were in isolation. One aspect of that
difference was that listeners exhibited greater agreement with one another
about the linguistic relationships that held among a set of vowels when those
vowels were in /dVd/ context than when they were in isolation. A second
aspect was that with isolated vowels listeners attended in a piecewise manner
to three different vowel dimensions, while with vowels in context they
integrated those dimensions in a way that was more consistent with other
aspects of linguistic behavior.

These findings have been interpreted as indicating that /dVd/ context had
the effect of eliciting more linguistic perceptual processing of vowels than
occurred when they were presented in isolation. To the degree that this
interpretation is appropriate, it follows that those who do linguistic
research on vowels in the future would do well to examine them in some
consonantal context.

Footnotes

¹The "appropriateness" of point positioning has to do with the distances
between the points. Those distances should, as nearly as possible, be ordered
in a manner that reflects order in the perceptual data (see also Footnote 9).

²With other scaling methods, such as those designed for the analysis of
single matrices of data (e.g., Shepard, 1962a, 1962b; Guttman, 1968), it is
necessary to perform a post hoc rotation of the scaling solution in order to
bring it into any sort of interpretable orientation. The particular rotation
performed is necessarily shaped by an investigator's intuitions about the
appropriate dimensions of interpretation and is, correspondingly, vulnerable
to the challenge that some other dimensions would have been equally (or more)
appropriate had some other rotation been carried out. It is just this post
hoc rotation that is precluded with individual differences scaling (Carroll &
Chang, 1970; Wish & Carroll, 1974).

³Since this is the most commonly discussed index to fit, it is the one
that will be considered here. Nevertheless, it should be noted that the data
were in fact scaled with a procedure (ALSCAL, designed by Takane et al., 1977)
that, for computational reasons, optimizes a related but slightly different
index called SSTRESS. This undoubtedly accounts for the slight decrement in
the VAF function seen at five dimensions (there was no such decrement in the
SSTRESS function). Solutions obtained by optimizing SSTRESS are extremely
similar to those obtained with alternative individual differences scaling
methods (Takane et al., 1977).

⁴VAF might not increase with an increase in dimensionality if a scaling
algorithm was halted after a fixed number of iterations or, more commonly, if
it was halted due to the encounter of a "local minimum" in the optimizing
function (see also Footnote 3).

⁵In English, the tenseness and length feature labels are not quite so
interchangeable as are, say, height and compactness. The "tense" vowels are
generally "long" vowels as well, but there is one notable exception: the
vowel /æ/. This vowel is phonologically "long," yet a usage rule treats it as
"lax" in that it cannot appear in open position. With respect to the group
space, /æ/ is likewise grouped with the "lax" vowels along dimension 3, which
makes the choice of the tenseness label particularly appropriate for this
dimension.

⁶The Pythagorean theorem holds that the distance between two points in a
three-dimensional space will be equal to the square root of the sum of the
squared distances between those points' coordinates along the three reference
axes. Hence, the distance between subject $i$ (indexed by the coordinates
$x_i$, $y_i$, $z_i$) and the centroid for all subjects in the same condition
(indexed by $x_c$, $y_c$, $z_c$) was computed with the equation:

$$\text{distance} = \left( (x_i - x_c)^2 + (y_i - y_c)^2 + (z_i - z_c)^2 \right)^{1/2}$$

⁷It is also noteworthy that when the two conditions of the experiment
were modeled separately, this significant difference in subject variability
remained (see Rakerd, 1982, for the analysis).
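
The distance-to-centroid computation described in Footnote 6 can be sketched
in a few lines of Python. The function names and the example weights are
hypothetical and serve only to illustrate the arithmetic; they are not data
from the experiment.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def distance_to_centroid(point, points):
    """Euclidean distance from one subject's point to the group centroid."""
    c = centroid(points)
    return math.sqrt(sum((point[k] - c[k]) ** 2 for k in range(3)))


# Hypothetical dimension weights for three subjects in one condition.
weights = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
print(centroid(weights))                               # (1.0, 1.0, 0.0)
print(distance_to_centroid((1.0, 3.0, 0.0), weights))  # 2.0
```

Computing one such distance per subject gives a variability measure for each
condition, which is the quantity compared across conditions in the analyses
of subject weights.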

⁸In Experiment 1, the group and weight spaces for the metric analysis of
the combined data were virtually identical to those for the nonmetric analysis
shown in Figures 2 and 3. When the conditions were modeled separately
(Rakerd, 1982), the metric and nonmetric solutions did differ, at least in
detail.

⁹To get a sense of what employing a starting configuration entails, it is
important to understand how the scaling procedure operates more generally. An
optimal fit to a set of data is achieved by successively adjusting the
stimulus configuration over a series of iterations. The scaling procedure
halts when the improvement achieved on any given iteration is less than some
specified tolerance value. The adjustment that is made to the configuration
amounts to moving the individual stimuli around in the group space in a way
that is sensitive to the modeling shortcomings of the existing configuration.
If, for example, the vowels /l»t»&/ were currently positioned in the space
such that the distances among them were ordered as follows:

and yet most subjects ranked vowel-pair similarity such that /l*i/ was judged
less similar than /i^t/, then on the following iteration of the procedure
there would be a shift in the positioning of /t/ to correct the mismatch
between model and data, i.e.,

/II » /

Since the scaling procedure is capable of making such adjustments, it is
possible to start with a truly random stimulus configuration and gradually,
over trials, to move to one that fits a data set quite well. It is equally
possible, however, to start with a configuration that, for a priori reasons,
might be expected to fit the data closely from the outset. To the degree that
it does, then the procedure will make only minor improvements and will halt in
a relatively small number of iterations (because those improvements will be
less than the halting tolerance level). Owing to this feature, it is
possible, in effect, to test out the appropriateness of a particular starting
configuration for the individual differences modeling of any set of data.

¹⁰Since scaling accounted for a relatively smaller percentage of the
variance in the memory data than it accounted for in the perception data
(Experiment 1), all weights are, on average, smaller here (i.e., closer to the
origin of the weight space).


CHILDREN'S PERCEPTION OF [s] AND [ʃ]: THE RELATION BETWEEN ARTICULATION AND
PERCEPTUAL ADJUSTMENT FOR COARTICULATORY EFFECTS



Virginia A. Mann, Harriet M. Sharlin, and Michael Dorman



Abstract. When synthetic fricative noises from an [ʃ]-[s] continuum
are followed by [a] and [u], adult listeners perceive fewer
instances of [ʃ] in the context of [u] (Mann & Repp, 1980). This
perceptual context effect presumably reflects adjustment for the
coarticulatory effects of rounded vowels on preceding fricatives,
and thus implies possession of tacit knowledge of this
coarticulation and its consequences. To determine the role of
articulatory experience in the ontogeny of such knowledge and the
consequent perceptual adjustment, the present study examined the
effect of [ei] and [u] on the perception of [s] and [ʃ] by children
who can and cannot produce these consonants. The stimuli comprised
synthetic frication noises from an [ʃ] to [s] continuum adjoined to
periodic portions excerpted from natural tokens of "shave" and
"shoe." The subjects included adults, five- and seven-year-old
children who correctly produce both [ʃ] and [s], and seven-year-old
children who misarticulate both fricatives. All three groups of
children showed a significant context effect equivalent to that of
adults and independent of age and of fricative articulation.
Therefore, productive mastery of [s] and [ʃ] is not responsible for
children's perceptual adjustment to vowel rounding on the spectra of
voiceless fricatives.



Among adult subjects, context effects in the perception of spoken
consonants are a well-established phenomenon (see Repp, 1982, for a recent
review). One acoustic pattern may support different phonetic interpretations
in different environments. Examples of such effects can be found in the
perception of bursts as cues for stop consonant place of articulation
(Liberman, Delattre, & Cooper, 1952), and in the perception of formant
transitions as cues to consonant place (Mann, 1980; Mann & Repp, 1981) and
manner (Miller & Liberman, 1979). Another example, and the one that concerns
us here, involves the place of articulation of voiceless fricative noises:
When a synthetic fricative noise ambiguous between [ʃ] and [s] precedes the
vowel [u], listeners perceive [ʃ] less often than when the same noise precedes
the vowel [a] (Fujisaki & Kunisaki, 1978; Mann & Repp, 1980).

Like a myriad of other context effects in speech perception, the
contrasting effect of [u] and [a] on perception of a preceding fricative noise
finds a parallel, and a plausible explanation, in the dynamics of articulatory
gestures and their acoustic consequences. The parallel is that, due to
coarticulation of adjacent phonemes, when [ʃ] and [s] precede a rounded vowel,
such as the English [u], they are influenced by anticipatory liprounding. The
effect is a lowering of fricative noise spectra relative to that which occurs
when [ʃ] and [s] are produced before an unrounded vowel, such as the English
[a] (Bondarko, 1969; Heinz & Stevens, 1961; Mann & Repp, 1980). The
explanation is that, since [s] noises, in general, involve higher spectral
frequencies than [ʃ] noises, any compensation for the consequences of
liprounding during fricative production would make a given noise appear
relatively higher when it occurs before a rounded vowel, thus decreasing the
likelihood that [ʃ] will be perceived.

Therefore, the tendency of adult listeners to give fewer [ʃ] responses
when synthetic fricative noises occur in the context of [u] is interpreted as
the reflection of a tendency to compensate for the acoustic consequences of
anticipatory liprounding on fricative noise spectra (Mann & Repp, 1980). That
they so take account of the acoustic consequences of articulatory dynamics as
they assign phonetic labels to speech stimuli is not a unique attribute of
fricative perception, but would seem to be a more general and fundamental
property of perception in the speech mode. It is as if speech perception is
guided by some tacit knowledge of the diverse acoustic consequences of
articulatory gestures (Repp, Liberman, Eccardt, & Pesetsky, 1978), and of the
subtle changes that necessarily ensue when sequences of such gestures weave
and overlap in fluent speech (Mann, 1980; Mann & Repp, 1981). The basis of
such knowledge, however, remains unclear, as does its role in young children's
speech perception. To gain insight into these issues, the present study has
explored the effects of [ei] and [u] on the perception of the [ʃ]-[s]
distinction among children who can produce [ʃ] and [s] and those who cannot.

It is possible that tacit knowledge about the articulation of a given
phoneme, and its diverse acoustic consequences, is gathered from listening to
one's own production of that phoneme. If so, experience with the articulation
of [s] and [ʃ] might be critical to any articulatory knowledge that allows the
child to compensate for the effects of liprounding on fricative noise spectra.
This hypothesis would be verified were we to find the normal contrasting
effects of [u] and [ei] only in the perception of fricatives by children who
can produce [s] and [ʃ], and not in that of children who have yet to produce
these phonemes.

On the other hand, it is likewise possible that children who cannot
produce [s] and [ʃ] could nonetheless be just as capable (or incapable, as the
case may be) of perceptually adjusting for the influence of liprounding on
fricative noise spectra. On finding this to be the case, we could reject a
hypothesis that correct fricative articulation is essential to knowledge about
the consequences of fricative-vowel coarticulation, and then turn to
considering three alternative bases of that knowledge. First, any tacit
knowledge underlying the effect of vocalic context on fricative perception
might be instantiated by more general experience with one's own articulation
as opposed to specific experience with fricative articulation. Second, it
could be brought about by experience with hearing and seeing the speech of
others. Third, given the many findings that at least some knowledge about the
acoustic consequences of articulation could be inborn (Kuhl & Meltzoff, 1982;
Miller & Eimas, in press), the ontogeny of tacit articulatory knowledge could
be largely under genetic control, and relatively independent of specific
experience, barring the necessary role of stimulation in the emergence of
genetic behaviors.



A review of the literature reveals that, while there are many studies of
the ontogeny of speech perception and production, much remains to be learned
about fricative perception, and its relation to fricative production.
Prelingual infants have been reported to be capable of discriminating
synthetic tokens of [sa] and [ʃa] (Eilers, 1980; Eilers & Minifie, 1975) and
six-month-old infants may distinguish natural tokens of [s] and [ʃ] in the
context of [a] and [u] (Kuhl, 1980). Yet when [s] and [ʃ] initiate natural
CVC syllables, children aged ten to eighteen months may fail to make a
perceptual distinction (Garnica, 1971; Shvachkin, 1973) and children as old
as five years of age may show confusions among natural tokens of [s] and other
fricative consonants (Abbs & Minifie, 1969). Likewise, although there are
reports that children as young as two or three years old may correctly produce
[s] and [ʃ] (Prather, Hedrick, & Kern, 1975), there is much evidence that
fricatives are produced relatively late in language development, and that
fricative misarticulation can be present well into the early elementary grades
(Moskowitz, 1975) with considerable individual variability (Ingram,
Christensen, Veach, & Webster, 1980). In short, it is unclear exactly when
the [s]-[ʃ] distinction is mastered either in perception or production, nor is
the relation between the two abilities apparent. On the basis of the common
observation that development of language comprehension precedes that of
language production, it might be tempting to discard a hypothesis that mature
production of the [ʃ]-[s] distinction is essential to mature perception of
that distinction. Nonetheless, there are no reports that falsify this
hypothesis, nor has a subtle and sensitive assessment of children's perception
of fricatives been undertaken, such as might be supplied through a study using
context effects.

With these considerations in mind, we conducted two experiments, each
concerned with the contrasting influence of [a] and [u] on young children's
perception of the [ʃ]-[s] distinction. Our methodology is drawn from that of
Mann and Repp (1980), employing a continuum of synthetic fricative noises
(ranging from one appropriate to [ʃ] to one appropriate to [s]) that were
followed by vocalic portions from natural syllables containing the vowel [ei]
or [u]. Their adult subjects were required to label the initial fricative of
each syllable as [ʃ] or [s], and the context effect was measured in terms of
the number of [ʃ] responses given in the context of each vowel. In Experiment
1, we adapt Mann and Repp's materials and their phoneme labeling task to a
forced-choice picture identification task suitable for use with preliterate
children, and we provide a test of these adaptations among a population of
five- and seven-year-old children who have mastered production of [s] and [ʃ].
Thus we demonstrate the utility of our procedure and discern whether any
marked changes in vocalic context effects occur following the mastery of
fricative production. In Experiment 2, we turn to a second population of
seven-year-old children who are in speech therapy because they have not
mastered production of [s] and [ʃ]. In this case, our goal is to discern
whether vocalic context effects are present before fricative articulation is
fully mastered.
Experiment 1

Method 

Subjects. All subjects were native speakers of English who had no prior
experience with synthetic speech. Adults were recruited from the Bryn Mawr
area and children were recruited from a local day-care center; none of them
had any known organic, behavioral, emotional, or intellectual problems. In
order to be considered as a potential subject, each adult had to report no
known hearing or speech pathologies. Each child had to have normal hearing
acuity as determined by preschool screening and to be able to produce
correctly the [s] and [ʃ] in "sue," "shoe," "save," and "shave." Chosen
according to these criteria, there were ten subjects at each of three age
levels in Experiment 1: five-year-olds (mean age 5.6 years), seven-year-olds
(mean age 7.5 years), and adults (mean age 22.4 years).


Materials. The stimuli were hybrid syllables consisting of synthetic
fricative noises followed by natural vocalic portions to form two [ʃ]-[s]
continua: "shoe"-"Sue" and "shave"-"save." To construct them, we began with
recordings of the words "shoe" and "shave" that had been read aloud by a
native male speaker of American English as part of a list of words containing
initial voiceless fricatives. All utterances were digitized at 10 kHz using
the Haskins Laboratories Pulse Code Modulation (PCM) system, and the single
best tokens of "shoe" and "shave" were chosen for further use. The fricative
noise was then removed from each of these (the fricative noise being defined
as the signal portion preceding the onset of periodicity), and replaced, in
turn, with each of nine digitized synthetic fricative noises created on the
Haskins Laboratories OVE IIIc speech synthesizer. The synthetic noises were
characterized by two steady-state poles whose center frequencies, as can be
seen in Table 1, increased in eight approximately equal steps from Stimulus 1,
which approximated a natural [ʃ], to Stimulus 9, which approximated a natural
[s]. Noise duration was held constant at 250 ms, with a 150 ms initial
amplitude rise and a 30 ms final amplitude fall.



Table 1

Pole Frequencies of Fricative Noises (Hz)

Stimulus    Pole 1    Pole 2

   1         1957      3803
   2         2197      3915
   3         2466      4148
   4         2690      4269
   5         2933      4394
   6         3199      4655
   7         3389      4792
   8         3591      4932
   9         3917      5077
For the purpose of testing perception of the test stimuli, two different
magnetic tapes were prepared, a separate one for each stimulus continuum.
Each tape consisted of a practice set comprising five tokens of each of the
two endpoint stimuli arranged in a random order, followed by a test set
comprising a randomized sequence that included five repetitions of each of the
nine test stimuli along the continuum. The interstimulus interval was held
constant at 5 sec.
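
The layout of each tape can be sketched as follows. This is only an
illustration: it assumes Python's standard pseudo-random shuffling in place
of whatever randomization was actually used, and the function name
`build_tape` and the seed are hypothetical.

```python
import random

def build_tape(n_stimuli=9, reps=5, seed=0):
    """Return (practice, test) presentation orders for one continuum.

    practice: `reps` tokens of each endpoint stimulus (1 and n_stimuli), shuffled.
    test: `reps` repetitions of every stimulus 1..n_stimuli, shuffled.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    practice = [1] * reps + [n_stimuli] * reps
    test = [s for s in range(1, n_stimuli + 1) for _ in range(reps)]
    rng.shuffle(practice)
    rng.shuffle(test)
    return practice, test

practice, test = build_tape()
print(len(practice), len(test))  # 10 practice items, 45 test items
```

With the default arguments this reproduces the ten practice items and 45
test items described in the procedure below.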



All testing was conducted individually at the residence (for adults) or
day-care center (for children) where the subject was solicited. Each subject
listened to stimuli over circumaural earphones at a presentation level of
approximately 70 dB SPL. Both tapes were completed within a single session,
with the order of presentation counterbalanced across subjects. For each
tape, the ten items in the practice set were presented first, followed by
presentation of the 45 test items. The procedure involved the subject's
listening to each stimulus and then reporting his or her phonetic perception.
Whereas adults gave written responses of "s" or "sh," as in the procedure of
Mann and Repp (1980), children gave two-alternative forced-choice pointing
responses to pictures that corresponded to the words on the tape ("a shoe"
vs. "a girl named Sue" for the [u] context, "a man having a shave" vs. "a
piggy-bank in which to save" for the [ei] context), and their responses were
transcribed by the examiner, who did not know the identity of the stimulus
being presented. To accustom children to this task, the experimenter showed
two pictures, "tree" and "blue," before the test tape was presented and asked
the child to point to the appropriate picture as she said each word aloud.
When the child correctly identified five presentations of each of these two
words arranged in random order, the task was repeated using pictures for
"shoe" and "blue." Finally, the child was shown the pictures for the
appropriate experimental task and given practice with the experimenter saying
each test word aloud. When the child had touched each picture correctly on
five occasions, arranged in random order, presentation of the prerecorded
practice and test stimuli followed.

Results and Discussion

The data for Experiment 1 consist of labeling responses of "s" and "sh"
for stimuli along each of our two experimental continua, gathered directly
from adults and inferred from children's picture verification responses. We
will briefly consider the data obtained with adult subjects, then proceed to a
report of the results obtained with children at each age, and a brief
discussion of their import.

Adults. A summary of the results obtained with the ten adult subjects
appears in Figure 1, where the average percent of "sh" responses is plotted as
a function of stimulus position along the fricative noise continuum,
separately for each vocalic context. Solid lines represent the results
obtained when fricative noises initiated a syllable containing the rounded
vowel [u], and dashed lines represent those obtained when the same noises
initiated the syllable containing the unrounded vowel [ei]. For both continua,
listeners were quite consistent in their labeling of the endpoint stimuli.
And, as expected, the category boundary for the labeling function obtained in
the context of the unrounded vowel from "shave" occurs between stimuli 5 and 6
(at 5.2), whereas that for the rounded vowel from "shoe" occurs between
stimuli 4 and 5 (at 4.2). Thus fewer "sh" responses were given in the context
of the rounded vowel, t(18) = 3.1, p < .01.
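
The report does not state how category boundaries were located. One common
approach, shown here purely as an assumption, is to interpolate linearly to
the point where the labeling function crosses 50% "sh" responses. The
function name and the labeling functions below are hypothetical and are not
data from this study.

```python
def category_boundary(percent_sh, criterion=50.0):
    """Interpolate the stimulus value at which "sh" responses fall to `criterion`.

    percent_sh[i] is the percent of "sh" responses to stimulus i + 1.
    Returns None if the function never crosses the criterion.
    """
    for i in range(len(percent_sh) - 1):
        above, below = percent_sh[i], percent_sh[i + 1]
        if above >= criterion > below:
            # Linear interpolation between stimulus i + 1 and stimulus i + 2.
            return (i + 1) + (above - criterion) / (above - below)
    return None

# Hypothetical labeling functions (illustrative values only).
unrounded = [100, 100, 100, 100, 60, 40, 0, 0, 0]
rounded = [100, 100, 100, 60, 40, 0, 0, 0, 0]

shift = category_boundary(unrounded) - category_boundary(rounded)
print(category_boundary(unrounded), category_boundary(rounded), shift)  # 5.5 4.5 1.0
```

A positive shift corresponds to the pattern reported here: the boundary lies
at a lower stimulus number, and fewer "sh" responses are given, in the context
of the rounded vowel.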




Figure 1. Influence of vocalic context on the labeling of fricative noises by
adult subjects. Horizontal axis: stimulus number (fricative noise).

Children. All children successfully learned the procedure, and were 100%
correct in identifying the pictures corresponding to spoken versions of the
test words and 80% or better correct in responding to the practice endpoint
stimuli. The results for five- and seven-year-olds are graphed in Figures 2
and 3, respectively. Here, as in the case of the adult subjects, both the
endpoint stimuli were labeled quite consistently, and here, as well, the
category boundaries for the two vocalic contexts lie at different locations.
The boundary for noises presented in the context of the unrounded vowel lies
at 5.5 for five-year-olds and 5.2 for seven-year-olds, while the boundaries
for noises heard in the context of the rounded vowel occur at 4.1 and 4.3,
respectively.

An analysis of variance, conducted on the total number of "sh" responses
given in each vocalic context by the adults and the children at each of the
two age levels, reveals a main effect of vocalic context, F(1,27) = 59.4,
p < .001, but no main effect of age, and no interaction between the effects of
age and vocalic context. Thus, all subjects, adults and children alike,
tended to give fewer "sh" responses in the context of the rounded vowel; for
five-year-olds, t(18) = 2.31, p < .05, and for seven-year-olds, t(18) = 3.37,
p < .01. Moreover, when measured as the difference between the number of "sh"
responses given in each context, the extent of the context effect among
children was not significantly different from that among adults (p > .1).
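
The degrees of freedom reported for the within-group comparisons (18, with
ten listeners per group) are those of a two-sample t statistic with pooled
variance, computed over the "sh" counts in the two contexts. Whether the
statistic was computed exactly this way is an assumption; the sketch below
only shows that computation, using hypothetical response counts.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical numbers of "sh" responses (out of 45) for ten listeners.
sh_unrounded = [24, 26, 23, 25, 27, 24, 26, 25, 23, 27]
sh_rounded = [20, 21, 19, 22, 20, 21, 18, 22, 20, 19]

t, df = pooled_t(sh_unrounded, sh_rounded)
print(df)  # 18 degrees of freedom for two sets of ten scores
```

A positive t here means more "sh" responses in the unrounded context, which
is the direction of the effect reported above.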
Figure 2. Influence of vocalic context on the labeling of fricative noises by
five-year-olds who can articulate [s] and [ʃ]. Horizontal axis: stimulus
number (fricative noise).
Figure 3. Influence of vocalic context on the labeling of fricative noises by
seven-year-olds who can articulate [s] and [ʃ].
Using a new set of stimuli, then, Experiment 1 has confirmed previous
reports (Fujisaki & Kunisaki, 1978; Mann & Repp, 1980) that when synthetic
fricative noises along an [ʃ]-[s] continuum are followed by a vocalic portion
that contains the vowel [u], the category boundary is shifted towards a lower
noise frequency, with fewer "sh" responses, than when the same fricative
noises are heard in the context of the vowel [ei]. Most importantly, it has
demonstrated that this vocalic context effect can be present among five- and
seven-year-old children who correctly produce [s] and [ʃ], and that, among
such children, the extent and direction of the effect is remarkably similar to
that obtained among adults. Thus children as young as five years of age who
can produce both [s] and [ʃ] show an adult-like perceptual compensation for
the coarticulatory effects of lip rounding on the spectra of those fricatives,
and we may conclude, therefore, that knowledge of fricative-vowel
coarticulation and its acoustic consequences does not markedly lag behind
productive mastery of [s] and [ʃ]. Otherwise, we should have found an
age-related difference between the children and adults who participated in our
study. This leaves us with two possibilities as to the relation between
perception and production: Either perceptual mastery precedes production
mastery, or the two begin at more or less the same time. To decide between
these alternatives, we turn to the second experiment of our study, which asks
whether a vocalic context effect is present among children who cannot produce
[s] and [ʃ].

Experiment 2 

Method 

Subjects. The subjects were fourteen children recruited from the
second-grade classes of parochial schools in Northeast Philadelphia, who
served with the permission of their parents and at the convenience of their
teachers. Each of them was selected with the help of speech therapists who
worked in their schools. They fulfilled all of the following criteria:

1) Incorrect production of initial [s] and/or [ʃ]: either substituting
one for the other, substituting another phoneme instead, or simply
omitting [s] and [ʃ] altogether.

2) No difficulty with the production of phonemes other than fricatives 
or affricates. 

3) A maximum of one year in speech therapy. 

4) Audiometry scores within the range defined in Experiment 1. 

5) No soft neurological signs, cerebral palsy, or emotional or behavioral
disorders.



Chosen according to these criteria, there were six females and eight males,
with an average age of 7.6 years.
Materials and Procedure

The materials and procedure were as in Experiment 1. Each subject was
excused from his or her classroom and taken to the speech room in the school,
where the experimenter explained that the child was helping her to study the
way children hear language. The subjects were assured that there was no right
or wrong answer involved, and that all that was required was to listen
carefully. The same procedure as in Experiment 1 was used, with training
followed by practice, culminating in presentation of the test trials. Order of
the test tapes was counterbalanced across subjects.

Results and Discussion 

The data obtained from the seven-year-old children who could not produce
[s] and [ʃ] are summarized in Figure 4, which should be compared with Figures
1-3 from Experiment 1. We have combined the results across children who
omitted [s] and [ʃ] (N = 8), those who substituted one for the other (N = 4),
and those who substituted another phoneme instead (N = 2), since the nature of
production errors did not appear to influence the pattern of results. As in
the first experiment, all subjects labeled the words spoken during training
with an accuracy of 100%, and also labeled the endpoint test stimuli with 80%
or better accuracy. Thus, they could clearly distinguish good exemplars of
[s] and [ʃ]. Inspection of Figure 4 further reveals that these children also
showed vocalic context effects on fricative perception. When the stimuli
along the synthetic continuum were followed by the vocalic portion from
"shave," the average phonetic boundary lies between stimuli 5 and 6 (5.2),
whereas that for the same fricative noises followed by the vocalic portion
from "shoe" lies between stimuli 4 and 5 (4.3). Thus, fewer "sh" responses
were given in the context of the rounded vowel than in that of the unrounded
one, t(26) = 3.79, p < .005.
Figure 4. Influence of vocalic context on the labeling of fricative noises by
seven-year-olds who cannot articulate [s] and [ʃ].
There are two points to be made in discussing these findings. The first,
and most central to our concern, is that perception of [s] and [ʃ] by children
who cannot produce both of these phonemes does not differ significantly from
that of adults and children of the same age who can produce them. That is,
their perception of [s] and [ʃ] is affected by vocalic context in the same
manner (i.e., more "sh" responses in the context of the unrounded vowel) and
to an equivalent extent. Thus, it would appear that these children can take
account of the consequences of coarticulation of a fricative with a following
vowel, even though they do not directly control those consequences in their
own speech production.

A second point, more pertinent to clinical concerns, is that the
exclusive problems with fricative articulation that distinguish the children
of our second experiment from those of the first experiment do not appear to
be due to aberrant perceptual abilities. This is a conclusion that has been
reached in several previous studies of children selectively impaired in
producing liquids (Strange & Broen, 1981). Perhaps some developmental delay
in motor control is the cause of selective misarticulation of fricatives and
affricates, given the distinguishing developmental characteristics of this
class as outlined by Ingram et al. (1980). Fricatives are avoided by many
very young children, and it is not impossible that certain children merely
avoid them longer than others. Also, since fricatives are among the last
phonemes to be produced correctly (and there seems little agreement on the
span of time involved in acquisition of other phonemes, much less this
controversial class), there is ample reason to suspect that many of the
children who participated in the present experiment are following a normal
pattern of phoneme acquisition, albeit more slowly.

However, before leaving this second point, we would like to recognize the
possibility that certain severe articulatory problems could be based in a
perceptual disorder (Strange & Broen, 1981). In this regard, we note that we
have examined a group of seven-year-old children who present with multiple
articulatory problems spanning three or more manner classes, and we have found
them to be quite different from children who selectively misarticulate
fricatives and affricates (Mann, Dorman, Strawhun, & Sharlin, 1982; Sharlin,
1982). Subjects who are multiple misarticulators give responses that tend to
be more erratic; their attentiveness is also noticeably lower, and they
"fidget" more than the other children whom we have tested. They behave as if
our task is in some way unexpectedly aversive, owing, perhaps, to an inability
to competently and confidently make the required perceptual distinction. In
addition, and most notably, these children are unique in their tendency as a
population to show no significant effect of vocalic context on fricative
perception.

General Discussion 

The following general conclusions can be drawn from the results of
Experiments 1 and 2: 1) Children as young as five years of age who correctly
articulate [s] and [ʃ] show vocalic context effects on fricative perception
that are commensurate with the context effects observed among adult subjects;
2) Competent production of [s] and [ʃ] is not necessary for the manifestation
of vocalic context effects on fricative perception; 3) The exclusive
misarticulation of fricative consonants, like other selective problems in
speech production (see Strange & Broen, 1981), is not simply attributable to
deficits in fricative perception.
We may now turn to a consideration of our findings as they pertain to the
various hypotheses, outlined in the Introduction, about the source of the
tacit knowledge of articulation that we hold responsible for the influence of
vocalic context on fricative perception, and that we presume to be guiding
mature speech perception. Certainly we may reject the hypothesis that
experience with the production of fricatives is essential to the acquisition
of the knowledge that allows listeners to compensate for the consequences of
fricative-vowel coarticulation on fricative noise spectra. Otherwise, we
should not have found vocalic context effects to be equally present in the
perception of children who can and cannot produce [s] and [ʃ]. Even children
who selectively omit fricatives altogether (of whom we tested eight) showed
vocalic context effects on fricative perception equivalent to those among
other children and adults.

This leaves us to consider the remaining three possibilities about the
basis of tacit articulatory knowledge. One possibility is that, while there
is no simple one-to-one dependence of knowledge about the consequences of
fricative-vowel coarticulation on competent production of [s] and [ʃ], some
experience with language production may be essential to the acquisition of
that knowledge; for example, experience with producing rounded and unrounded
vowels and observing their different consequences on sound spectra in general.
A second possibility is that tacit articulatory knowledge does not emerge
through feedback from one's own articulation so much as through experience
with listening to, and perhaps watching, the articulations of others. A third
is that tacit knowledge is not induced by experience with one's own
articulation or that of others, but is genetically given so as to be present
and functioning by the age of five years, before successful fricative
production (contingent, perhaps, on some type of auditory stimulation). Each
of these possibilities is equally consistent with the present findings, but,
as we shall now argue, they are not equally consistent with certain other
findings in the literature.

Considering each possibility in turn, we note that the first is
inconsistent with reports that subjects who lack speech production abilities
may nonetheless demonstrate apparently normal speech perception (Fourcin,
1974), and with a report by Whalen (1981), who shows vocalic context effects
for non-native vowels. However, before concluding that feedback from one's
own articulation is not a prerequisite for acquiring tacit knowledge about
articulation and its consequences, it would be desirable to repeat the present
study using subjects with total congenital inability to speak.

We turn next to the second hypothesis, which stresses experience with the
articulation of others. While this is consistent with the perceptual
capabilities of inarticulate subjects, and with the late onset of certain
speech perception abilities, it is at odds with findings that neonates display
adult-like discrimination of many speech sounds (see, for example, Eilers,
1980, and Miller & Eimas, in press, for reviews). One might test this
hypothesis by studying children who have recently been corrected for a
congenital hearing loss, or by examining congenitally blind children who have
not had the opportunity to observe the lip-rounding gestures of others. If
subjects in these groups show normal vocalic context effects on fricative
perception, it would suggest that experience with the articulations of others
does not have a critical role in instantiating knowledge of articulation and
its consequences. However, finding that such children fail to show context
effects might be interpreted either as evidence that experience instantiates
articulatory knowledge, or as evidence that experience merely facilitates or
maintains such knowledge (Gottlieb, 1976). Testing neonates and young
children could ultimately decide between these two possibilities.

The third and final possibility is that tacit knowledge of articulation
and its consequences is genetically endowed, as opposed to deriving from
experience with the consequences of one's own articulation or with those of
others. This hypothesis is consistent with findings that neonates prefer to
look at a face articulating the same vowel that they are hearing (Kuhl &
Meltzoff, 1982), and also display a wide range of speech perception behaviors
that are directly analogous to the perceptual capabilities of adult speech
perceivers, including certain context effects (Miller & Eimas, in press). It
is further consistent with evidence that, although infrahuman species
discriminate certain speech sounds much as human listeners do (cf. Kuhl &
Miller, 1978; Waters & Wilson, 1976), and may even categorize fricative
noises along an [s]-[ʃ] continuum (Sevchik, 1979), they fail to show the
present vocalic context effects on fricative perception (Sevchik, 1979). If
context effects that involve tacit knowledge of articulatory dynamics are
unique to human listeners, then it is likely that the knowledge they depend on
is genetically based.
TRADING RELATIONS AMONG ACOUSTIC CUES IN SPEECH PERCEPTION: SPEECH-SPECIFIC
BUT NOT SPECIAL*

Bruno H. Repp



The perception of most, if not all, phonetic distinctions is sensitive to
multiple acoustic cues. That is, there are several distinct aspects of the
acoustic speech signal that enable listeners to distinguish between, for
example, a voiced and a voiceless stop consonant, or between a fricative and an
affricate. Although some cues are more important than others for a given
distinction, listeners can usually be shown to be sensitive to even the less
important cues when the primary cues are removed or set at ambiguous values.
All cues that are relevant to a given phonetic contrast seem to carry
information for listeners.

The relevance of a cue can be predicted from comparisons of typical
utterances exemplifying the phonetic contrast of interest. Any acoustic
property that systematically covaries with a phonetic distinction may be
considered a relevant cue for that distinction and may be expected to have a
perceptual effect when the conditions are appropriate.

In many recent speech perception experiments several acoustic cue
dimensions have been varied simultaneously. Provided the cues are adjusted so
that each has an opportunity to influence the perception of the relevant
phonetic distinction, it can easily be demonstrated that a little more of one
cue can be traded against a little less of another cue, without changing the
phonetic percept. This is called a phonetic trading relation.

The perceptual equivalence of acoustically different stimuli obtained by
trading two cue dimensions goes beyond the mere equivalence of response
distributions. As several recent studies have shown, these stimuli are very
difficult to tell apart in a discrimination task. Thus the trade-off among
the cue dimensions takes place entirely without the listener's awareness, and
only extensive auditory discrimination training might reveal the differences
that exist at the auditory level.

Phonetic trading relations are a ubiquitous phenomenon. Whenever two
acoustic cues contribute to the same phonetic distinction, they can also be
traded against each other, within a certain range. Thus, these trading
relations are a manifestation of a more general perceptual principle of cue
integration, by which I mean the assumption that, in phonetic perception, the
information conveyed by a variety of acoustic cues is integrated and combined
into a unitary perceptual experience that can be described in terms of
linguistic categories. But what causes the various cues to be integrated and
to trade with each other?

*This position paper was presented at the Tenth International Congress of
Phonetic Sciences in Utrecht, August 1983.

One current school of thought holds that the integration of cues and the
ensuing trading relations are due to auditory interactions of one sort or
another. Proponents of this hypothesis, while ready to admit that the
psychoacoustics of complex speech signals are not yet well understood,
nevertheless believe that phenomena known from research with nonspeech
stimuli, such as auditory adaptation, masking, or integration, can account for
trading relations in speech. Opponents of this hypothesis, on the other hand,
like to point out the great acoustic diversity of the cues involved and their
distribution over considerable temporal intervals. Obviously, and especially
as far as specific trading relations are concerned, this dispute can only be
settled by empirical research. A number of recent experiments have addressed
this issue, employing several different techniques, some of which are alluded
to in my abstract. I do not have the time to summarize the results here;
suffice it to say that the available evidence suggests that many phonetic
trading relations occur only when listeners engage in the phonetic
classification of speech signals, and not when they identify analogous
nonspeech stimuli or discriminate auditory properties of speech. Thus these
trading relations seem to be a product of phonetic categorization, not of
interactions in the auditory system. This is not to say that auditory
interactions do not occur in speech signals, although it is possible that, due
to the intimate familiarity of listeners with speech, such interactions have
less of a perceptual impact than in less familiar nonspeech stimuli. Certain
effects of irrelevant signal properties on phonetic perception do seem to
require a psychoacoustic explanation. And indeed, some of the many trading
relations that now appear to be phonetic in origin may eventually be proven to
rest on an auditory interaction. It seems extremely unlikely, however, that
all of them will be so explained.

The reason for this prediction of mine is that psychoacoustic approaches
to speech perception often seem to ignore a crucial fact: that phonetic
classification takes place with reference to norms established through past
experience with a language. Although this experience has been filtered and
transformed by the constraints and nonlinearities in the auditory system
through which it had to pass, the current input undergoes precisely the same
transformations, so that the topological relationship between it and the
internal representation of past experience remains essentially unchanged. It
is this relationship that determines the phonetic percept by a principle of
proximity: The input is perceived as whatever it resembles most in the past
experience of the individual. There is, of course, much more to be learned
about the perceptual metric that relates speech stimuli and the
representations of phonetic categories in the listener's mind, and auditory
nonlinearities may indeed influence that metric. The essential point,
however, is that the perception of phonetic categories derives from a
relationship, and not from any properties of the acoustic signal per se.
Neither the relevance nor the perceptual importance of acoustic cues can be
predicted from an inspection of the input alone. Rather, the integration and
weighting of the cues is a perceptual function based on a relationship of
input to knowledge within the speech domain. Phonetic trading relations are,
therefore, necessarily a speech-specific phenomenon, even if certain
individual trading relations could potentially (or do in fact) arise from
auditory interactions. As we learn more about the peripheral auditory
transformations of speech signals, we may eventually be able to redefine the
perceptual cues in a way that makes the trading relations among them
exclusively phonetic.

Having argued for the speech-specificity of phonetic trading relations, I
would now like to address the question of whether the perceptual integration
of phonetically relevant cues is achieved by some special machinery or
process, or whether it reflects a general principle of perception. In the
past, it has often been argued that speech perception makes reference to
speech production, and that perceptual processes actually make use of some of
the neural networks engaged in articulation. This certainly remains an
interesting and important hypothesis at the neurophysiological level. To the
perceptual theorist, however, it really should be a truism: Since speech
perception occurs with reference to internal criteria based on language
experience, and since language is produced in a systematic manner by human
vocal tracts, the listener's internal representation of past experience with
his or her language necessarily embodies articulatory constraints as well as
language-specific characteristics. In other words, I would like to argue that
speech perception must reflect the way speech is produced because the criteria
for perceptual classification are the production norms of the language. To
say, therefore, that speech perception refers to speech production is merely
to state the obvious.

A more specific hypothesis regarding phonetic trading relations might be
proposed, however. It might be argued that many individual cues that trade in
perception also trade in production, in the sense that there is a continuous
covariation of the two acoustic cues, due to some articulatory reciprocity,
even within phonetic categories. If it were the case that perceptual trading
relations are obtained only for cues that show such continuous covariation in
production, then it might be argued that speech perception makes use of
specific knowledge of patterns of articulatory variability, and since the
brain presumably cannot store an infinity of variants, it might be inferred
that reference is made to an internal representation of the articulatory
mechanism that enables listeners to generate specific cue relationships.
Although this hypothesis needs to be explored in greater depth, it seems to me
that the continuous covariation of cues in production should not be a
necessary condition for perceptual trading relations to occur. All that is
required is that typical instances of two different phonetic categories differ
along two or more acoustic dimensions. It is much more plausible and
parsimonious to assume that the listener's brain retains a record of typical
instances of utterances, that is, of the central tendencies in the variability
encountered, rather than of the variability itself. While this system of
phonetic category prototypes must be adjustable to the changing
characteristics of ongoing speech, at any given point in time it provides the
stable reference points that guide speech perception.

From this broad vantage point, phonetic classification is a form of
pattern recognition. Speech signals may be thought of as points or traces in a
multidimensional auditory space that also harbors the appropriately tuned
category prototypes, and phonetic categories are selected on the basis of some
distance metric. Trading relations among the various acoustic dimensions of
this auditory-phonetic space are an obvious consequence. What makes speech
special, in this view, is not the processes or mechanisms employed in its
perception but the unique structure of the patterns that are to be recognized,
which reflect in turn the special properties of the production apparatus and
the language-specific conventions according to which it is operated.
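
The pattern-recognition view can be illustrated with a minimal
nearest-prototype classifier. This is only a sketch under stated assumptions:
the two-dimensional cue space, the prototype coordinates, the scale factors,
and the Euclidean metric are hypothetical choices made for illustration, not
values or methods reported here.

```python
import math

# Hypothetical prototypes in a two-dimensional cue space:
# (closure silence in ms, burst level in dB re an arbitrary reference).
# These coordinates are illustrative assumptions, not measurements.
PROTOTYPES = {
    "say": (0.0, 0.0),     # no closure silence, no burst
    "stay": (60.0, 30.0),  # long closure silence, strong burst
}

# Assumed scale factors that put both dimensions on a common footing.
SCALES = (60.0, 30.0)


def distance(point, prototype):
    """Scaled Euclidean distance between a stimulus and a prototype."""
    return math.sqrt(
        sum(((p - q) / s) ** 2 for p, q, s in zip(point, prototype, SCALES))
    )


def classify(closure_ms, burst_db):
    """Return the label of the nearest prototype."""
    point = (closure_ms, burst_db)
    return min(PROTOTYPES, key=lambda label: distance(point, PROTOTYPES[label]))


def boundary_closure(burst_db, step=1.0, max_ms=120.0):
    """Shortest closure (ms) at which the label becomes 'stay'."""
    closure = 0.0
    while closure <= max_ms:
        if classify(closure, burst_db) == "stay":
            return closure
        closure += step
    return None


# The weaker the burst, the more silence is needed to reach "stay".
for burst in (30.0, 15.0, 0.0):
    print(burst, boundary_closure(burst))
```

In this toy space the category boundary is a straight line across both
dimensions, so a loss on one cue is compensated by a gain on the other. The
trading relation follows from the distance metric alone, with no mechanism
specific to speech.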

In summary, then, I have argued that, on the one hand, phonetic trading
relations are speech-specific but, on the other hand, they are not special as
a phenomenon. They are speech-specific because their specific form can only
be understood by examining the typical patterns of a language. They are not
special because, once the prototypical patterns are known in any perceptual
domain, trading relations among the stimulus dimensions follow as the
inevitable product of a general pattern matching operation. Thus, speech perception
is the application of general perceptual principles to very special patterns.



THE ROLE OF RELEASE BURSTS IN THE PERCEPTION OF [s]-STOP CLUSTERS*



Bruno H. Repp 



Abstract. The role of the release burst as a cue to the perception
of stop consonants following [s] was investigated in a series of
studies. Experiment 1 demonstrated that silent closure duration and
burst duration can be traded as cues for the "say"-"stay"
distinction. Experiment 2 revealed a similar trading relation between
closure duration and burst amplitude. Experiments 3 and 4 suggested,
perhaps surprisingly, that absolute, not relative, burst amplitude
is important. Experiment 5 demonstrated that listeners' sensitivity
to bursts in a labeling task is at least equal to their sensitivity
in a burst detection task. Experiments 6 and 7 replicated the
trading relation between closure duration and burst amplitude for labial
stops in the "slit"-"split" and "slash"-"splash" distinctions,
although burst amplification, in contrast to attenuation, had no
effect. All experiments revealed that listeners are remarkably
sensitive to the presence of even very weak release bursts.

Introduction

A large proportion of speech perception research has been concerned with
stop consonants. Nevertheless, there are still gaps in our knowledge of the
relevant acoustic cues and their perceptual importance. While much attention
has been lavished on the perception of stop consonant voicing and place of
articulation, the more basic question of whether or not a stop consonant is
perceived at all has been addressed in only a handful of studies. Moreover,
nearly all of these studies have used synthetic speech stimuli in which at
least one important cue was commonly absent: the release burst that
terminates the stop closure. The present series of studies explores the role
of this cue in the perception of stop consonants after [s].

A good deal is known about some other cues to stop manner perception, at
least in the context of preceding [s] and following vowel or [l]. One very
important cue is an interval of silence corresponding to the period of oral
closure that characterizes stop consonant articulation. Early research at
Haskins Laboratories by Bastian (1959, 1962) as well as the recent thorough
investigations of Bailey and Summerfield (1980) have shown that an interval of
silence between an [s]-noise and a steady-state synthetic vowel is generally
sufficient to elicit a stop consonant percept, given that the silence is
longer than about 20 ms (but not excessively long), and that the vowel is not too
open. Silence frequently is not only sufficient but also necessary for the
perception of a stop, for even when other stop manner cues are present in the
signal (neglecting release bursts for the moment), stops are rarely perceived
in the absence of an appropriate closure interval (Bailey & Summerfield, 1980;
Best, Morrongiello, & Robson, 1981; Dorman, Raphael, & Liberman, 1979;
Fitch, Halwes, Erickson, & Liberman, 1980).

*Also Journal of the Acoustical Society of America, in press.

Other relevant cues reside in the signal portions adjacent to the closure
interval. Changes in spectrum and/or a rapid amplitude drop in the preceding
fricative noise signify the approach of the closure and thereby contribute to
stop perception (Repp, unpublished data; Summerfield, Bailey, Seton, &
Dorman, 1981). Similarly, formant transitions and/or a rapid amplitude rise at
the onset of the following vocalic portion, a rising transition of the first
formant (F1) in particular, signify rapid opening and thereby constitute an
important stop manner cue (Bailey & Summerfield, 1980; Best et al., 1981;
Fitch et al., 1980). There is also evidence that the durations of the
acoustic segments preceding and following the closure can influence stop manner
perception (Summerfield et al., 1981; however, see also Marcus, 1978). These
additional cues engage in trading relations with the temporal cue of closure
duration; that is, the stronger they are, the less closure silence is needed
to perceive a stop. (For analogous findings for stops in vowel-[s] context,
see Dorman, Raphael, & Isenberg, 1980.) In general, however, these studies
suggest that a minimal amount of silence (about 20 ms) is needed for a stop to
be perceived at all.

Nearly all of the above-mentioned studies used synthetic speech stimuli
that did not include any release bursts. One reason for this omission was
presumably that good bursts are difficult to synthesize. Although most
researchers are probably aware of the relevance of release bursts to the
perception of stop manner, the importance of this cue has not been
sufficiently acknowledged in the literature, which has emphasized the role of the
closure duration cue. In an unpublished study, Repp and Mann (1980) took three
tokens each of [sta], [ska], [ʃta], and [ʃka], produced by a male speaker,
excised the closure period, and replaced the natural fricative noises with
synthetic ones of comparable amplitude. In one condition, the stimuli
retained the natural release bursts, and the subjects continued to report stop
consonants on 100 percent of the trials, with very few place-of-articulation
errors. In another condition, the release bursts were excised, and stop
responses fell to 3 percent (except for two subjects who continued to report
stops, but with poor accuracy for place of articulation). These data clearly
illustrate the salience of the release burst as a manner cue for alveolar and
velar stops following fricatives. Labial stops, on the other hand, are
associated with weaker release bursts (see Zue, 1976) that may not be
sufficient to cue a stop percept in the absence of an appropriate closure interval.

The present series of studies attempts to answer several questions about
the role of release bursts in stop manner perception: (1) Given that an
interval of silence is needed to hear an alveolar stop when there is no release
burst but not when there is one, how much can the burst cue be weakened before
any silence is needed, and will further weakening of the burst result in
increasing amounts of silence required? In other words, how sensitive are
listeners to burst cues, and is there a regular trading relation between the
burst and silence cues? These questions are explored in Experiments 1 and 2
by manipulating alveolar burst duration and amplitude. (2) Given an effect of
burst amplitude that can be traded against silence duration, Experiments 3 and
4 investigate whether it is absolute or relative burst amplitude that matters.
(3) Experiment 5 addresses the question of whether the point at which an
attenuated release burst ceases to trade with silence coincides with the
auditory detection threshold for the burst. (4) The role of burst amplitude is
further investigated in Experiments 6 and 7 with labial stops, with special
attention to the question of whether amplification of a weak labial burst can
make it a more powerful manner cue.

Experiment 1 

The purpose of Experiment 1 was to demonstrate the relative importance of
an alveolar release burst as a stop manner cue, and to create a trading
relation between burst and silence cues by varying the durations of both in
natural-speech stimuli.

Method 

Stimuli. Good tokens of "say" and "stay" were selected from recordings
of several repetitions produced by a female speaker in a sound-insulated
booth. These two utterances were low-pass filtered (-3 dB at 9.6 kHz, -55 dB
at 10 kHz), digitized at 20 kHz, and modified by waveform editing. To reduce
stop manner cues in the fricative noise, which were not of particular interest
in the present study, the [s]-noise of "say," 176 ms in duration, was used in
all experimental stimuli. This noise was followed by a variable interval of
silence and by one of seven different "day"-like portions, roughly 550 ms in
duration. Six of these were derived from the token of "stay" while the
seventh represented the vocalic portion of the "say" token.

Figure 1 shows the waveforms of the onsets of these stimulus portions.
On top is the original post-closure portion of "stay," which began with a
rather powerful (but, for that speaker, not atypical) release burst of
somewhat less than 20 ms in duration. The rms amplitude of the total burst was
determined to be 4.6 dB below the vowel onset and 6.8 dB below the vowel peak
(135 ms later), with an amplitude decrease of about 10 dB from the initial to
the final quartile of the burst. The release burst was cut back in five
steps, as indicated in the figure. Successive cuts (versions 2-6) were made
at 6.1, 10.6, 13.4, 15.2, and 19.6 ms from the onset. These cut points were
selected visually on the basis of local dips in the waveform. In each case,
the cut was made at the nearest zero crossing. The stimulus portion derived
from "say" is shown at the bottom of Figure 1, aligned so as to show its
similarity with version 5 of the "day" portion on top. Despite this
similarity of waveforms, however, there were presumably some spectral differences
between these two portions, due to the different contexts in which they had been
articulated.

Figure 1. Onset waveforms of stimuli used in Experiment 1. The top panel
shows the first 40 ms following the closure in "stay"; the bottom
panel shows the first 24 ms following the fricative noise in "say".
Arrows in the top panel indicate cut points for release burst
truncation.

Figure 2. Trading relation between alveolar release burst duration and
closure duration (Exp. 1). Numbers refer to cut points illustrated in
Figure 1. The dashed line represents the token derived from "say".
Closure duration (abscissa) refers to the actual silence in the
stimuli.

The silent interval separating the initial fricative noise from the "day"
portions was varied from 0 to 60 ms in 10-ms steps. Because tokens with large
bursts were expected to be perceived as "stay" even without any silence, a
semi-orthogonal design was employed that assigned an increasingly wider range of
silence durations to tokens with increasingly shorter bursts. Thus the
stimulus with the most powerful burst occurred only with the 0-ms silence, while the
stimulus derived from "say" occurred with all seven silent intervals. This
led to a total of 28 different stimuli that were recorded on audio tape in 10
different randomizations, with interstimulus intervals of 2 s.
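
The stimulus count can be checked with a short enumeration. This is a sketch
under one assumption: the text gives only the two extremes of the design
(version 1 with the 0-ms silence alone, the "say"-derived version with all
seven silences) and the total of 28, so the triangular assignment for the
intermediate versions used below is inferred, not stated.

```python
# Closure silences used in Experiment 1: 0 to 60 ms in 10-ms steps.
SILENCES_MS = list(range(0, 61, 10))

# Versions 1-6 are derived from "stay"; version 7 is the "say"-derived portion.
N_VERSIONS = 7


def semi_orthogonal_design():
    """Pair version k with the k shortest silences (assumed assignment)."""
    return [
        (version, silence)
        for version in range(1, N_VERSIONS + 1)
        for silence in SILENCES_MS[:version]
    ]


design = semi_orthogonal_design()
n_stimuli = len(design)       # 1 + 2 + ... + 7 stimuli
n_tape_items = n_stimuli * 10  # ten randomizations on the test tape
print(n_stimuli, n_tape_items)
```

Under this assumption the design yields the 28 stimuli reported, and the ten
randomizations give 280 test items per listener.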

Subjects and procedure. Ten subjects participated, including nine paid
volunteers and one member of the laboratory staff (not a speech researcher).
None of the subjects reported any hearing problems, and all had only very
limited experience in speech perception experiments. The stimuli were
presented binaurally over calibrated TDH-39 earphones in a quiet room. The
subjects identified in writing each stimulus as either "say" or "stay."

Results and Discussion 

Average percentages of "stay" responses are shown as a function of silent
closure duration in Figure 2, separately for each of the seven stimulus
patterns. It is evident that versions 1, 2, and 3 were invariably identified as
"stay," even in the absence of silence. Thus, even the remainder of the burst
following the initial high-amplitude portion (version 3, see Figure 1) was a
sufficient cue for stop manner. As the burst was cut back further, increasing
amounts of silence were necessary to achieve a percept of "stay." The stimulus
with the "say"-derived portion yielded results similar to those for version 6,
and it appears that neither provided sufficient cues for unambiguous "stay"
percepts, even at the longest silences used here.

What is most striking about these results is the large perceptual effect
that a small burst cutback had on perception. The change from version 4 to
version 5 consisted of the elimination of only 1.8 ms of relatively
low-amplitude noise at onset (see Figure 1); however, listeners needed approximately
10 ms more silence to compensate for this loss and achieve the same average
rate of "stay" responses. Similarly, the change from version 5 to version 6
consisted of the elimination of the last 4.4 ms of burst residue. The
perceptual effects were dramatic: At least 20 ms of additional silence were
needed to compensate for the loss, and several listeners were not able to
compensate for it at all, reporting only "say" for version 6. Even those few
subjects who did reach a 100-percent "stay" asymptote for version 6 and had
very steep labeling functions showed large effects of the stimulus
manipulations.

Thus, this study not only demonstrates a perceptual trading relation
between burst duration and silence duration but also that listeners are
remarkably sensitive to what seem to be rather minute changes in the onset
characteristics of the stimulus portion following the silent closure interval.
Of course, the truncation of the release burst introduced not only variations
in burst duration but also changes in overall burst amplitude, in its onset
amplitude characteristics, and perhaps correlated spectral changes. Any of
these may have been responsible for the effects observed, but it is still true
that relatively small physical changes had relatively large perceptual
consequences.

Experiment 2 

Experiment 2 examined one parameter that may have played a role in
Experiment 1: the overall burst amplitude. The purpose of the study was to
demonstrate a trading relation between release burst amplitude and closure
duration as joint cues to stop manner perception.

Method

Stimuli. In Experiment 1, stimulus version 3 was just on the verge of
requiring some silence in addition to the truncated burst, in order for a stop
to be perceived on all trials (see Figure 2). This stimulus was chosen as the
starting point. Its residual burst was 9 ms in duration (see Figure 1), with
a total rms amplitude 10.8 dB below the vowel onset and 15.1 dB below the
vowel peak. Five additional versions were created by digitally attenuating
the burst by up to 30 dB in 6-dB steps. In a seventh version the burst was
infinitely attenuated (i.e., it was replaced with 9 ms of silence); thus, this
stimulus was equivalent to stimulus version 6 in Experiment 1.
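
Digital attenuation of this kind amounts to scaling the burst samples by a
constant gain. The following minimal sketch shows the decibel arithmetic
involved. The function names and the sample values are hypothetical and serve
only as illustration; they do not describe the waveform editing software
actually used.

```python
import math


def db_to_gain(attenuation_db):
    """Amplitude scale factor for an attenuation in dB (positive = quieter)."""
    return 10.0 ** (-attenuation_db / 20.0)


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def level_db(samples, reference):
    """RMS level of `samples` in dB relative to the RMS of `reference`."""
    return 20.0 * math.log10(rms(samples) / rms(reference))


def attenuate(samples, attenuation_db):
    """Scale every sample by the gain corresponding to `attenuation_db`."""
    gain = db_to_gain(attenuation_db)
    return [x * gain for x in samples]


# Attenuation steps described for Experiment 2: 0 to 30 dB in 6-dB steps.
STEPS_DB = list(range(0, 31, 6))

# Hypothetical burst samples, for illustration only.
burst = [0.5, -0.25, 0.75, -0.5]
for step in STEPS_DB:
    print(step, round(level_db(attenuate(burst, step), burst), 1))
```

Each 6-dB step roughly halves the waveform amplitude, so the 30-dB condition
scales the burst to about 3 percent of its original amplitude. The seventh
version corresponds to a gain of zero.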

Silent intervals ranging from 0 to 60 ms in 10-ms steps were assigned to
the stimuli using the same design as in Experiment 1. Thus, version 1
occurred only with the 0-ms interval while version 7 occurred with the full
range of closure durations. The resulting 28 stimuli were recorded in 10
different randomizations.

Subjects and procedure. Twelve new subjects participated in this study.
The data of one had to be discarded because he could not reliably distinguish
among the stimuli. The remaining eleven subjects included eight staff members
of Haskins Laboratories (including the author) with varying amounts of
experience in speech perception tasks, and three paid student volunteers. The
procedure was the same as in Experiment 1.

Results and Discussion

Average percentages of "stay" responses are shown as a function of silent
closure duration in Figure 3, separately for each of the seven attenuation
conditions. It is evident that there is an orderly progression of labeling
functions: As the burst got weaker, more silence was needed to perceive a
stop consonant.

The figure suggests that a burst attenuated by as much as 30 dB still led
to more stop responses than a stimulus without any burst. This was confirmed
in a one-way analysis of variance on the stop responses to these two types of
stimuli, summed over closure durations of up to ^0 ms, F(1,10) = 9.8, p < .02.
Since, in the 30-dB attenuation condition, the amplitude of the 9-ms residual
burst was about 45 dB below the vowel peak amplitude (or at about 3^ dB SPL
versus about 83 dB SPL for the vowel at the subjects' earphones), this finding
again reveals that listeners are remarkably sensitive to burst cues.

Two additional comments are in order concerning Figure 3. First, it
should be noted that, in the infinite attenuation condition, the nominal
closure ended at the beginning of the nonexistent burst. Therefore, the actual
duration of the silence in these stimuli was 9 ms longer, as indicated by the
arrows in the figure, which makes the results more nearly comparable to those
for the same stimulus (version 6) in Experiment 1 (see Figure 1). It would
not have been appropriate to plot these data in terms of actual silence
duration because the effective silence durations resulting from various degrees of
burst attenuation are not known. Note, however, that such a plot would tend
to space the functions in Figure 3 farther apart and thus increase the
observed effects. This distinction between nominal and actual closure duration
will recur in later experiments.

Figure 3. Trading relation between alveolar release burst amplitude and
closure duration (Exp. 2). Negative numbers refer to amplitude
decrement (in dB). Closure duration (abscissa) is nominal; the actual
silence durations in the infinite-attenuation condition were 9 ms
longer due to the silenced burst, as indicated by the arrows.



Second, it will be noted that, contrary to expectations based on
Experiment 1, the unattenuated stimulus did not receive 100 percent "stay"
responses, while the burstless stimulus did reach this asymptote at the longer
silences. Although there was considerable variability among individual
subjects with regard to how the unattenuated stimulus was perceived, the pattern
of the data suggests that the subjects gave somewhat more weight to closure
duration and less weight to the burst in Experiment 2 than in Experiment 1.
The reason for this is not known.

In summary, the present study demonstrated the expected trading relation
between burst amplitude and closure duration, and it showed that severely
attenuated (and truncated) bursts still can have a perceptual effect.

Experiment 3 

Given the finding of the preceding study that burst amplitude is an
important parameter, Experiment 3 addressed the question of whether the
perceptually relevant aspect of burst amplitude is its absolute magnitude or
its magnitude relative to the surrounding signal portions.

Method 

Stimuli. Taking the data of Experiment 2 as a guideline, the stimulus
with the 12-dB attenuation of the 9-ms residual burst was selected as the
starting point for the present study. Four other stimuli were created by
selectively attenuating portions of this original stimulus, as illustrated
schematically in the upper right-hand corner of Figure 4. In addition to (a)
the original stimulus, there were stimuli with attenuation of (b) only the
burst, (c) both the burst and the following vocalic portion, (d) the burst and
the preceding fricative noise, and (e) the whole stimulus. Attenuation was by
12 dB in all cases.

All stimuli occurred with all closure durations, which varied from 0 to
40 ms in 10-ms steps. The resulting 25 stimuli were recorded in 10 different
randomizations.

Subjects and procedure. Ten subjects participated, including six new
paid volunteers and four staff members of Haskins Laboratories (including the
author). Results were similar for the two groups of subjects and were
combined. One subject reported only "say" during the first half of the test, so
only her data from the second half were included. The procedure was the same
as in Experiments 1 and 2.

Results and Discussion 

The labeling functions for the five conditions are drawn in the top panel
of Figure 4. Clearly, the stimulus manipulations made a difference. This was
confirmed by a one-way analysis of variance on the percentages of "stay"
responses summed over all closure durations, F(4,36) = 12.6, p < .001.
Statistical comparisons among individual conditions were done by post-hoc
Newman-Keuls tests. According to these tests, condition (a) differed
significantly (p < .01) from all other conditions, and condition (c) differed (p <
.05) from condition (d).

A graphic comparison among conditions is provided in the bottom part of
Figure 4 in terms of the location of the average "say"-"stay" boundary
(obtained by linear interpolation between the data points straddling the
boundary) on the closure duration dimension. Proceeding from left to right through
the five panels, we see the following: (1) Attenuation of the fricative
noise, holding the other stimulus components constant, increased the number of
stop responses slightly (i.e., the boundary shifted to a shorter silence
duration). (2) Attenuation of the burst decreased stop responses substantially,
which replicates Experiment 2. (3) Attenuation of the voiced portion resulted
in a slight decrease in stop responses. (4) Attenuating both the fricative
noise and the voiced portion together had absolutely no effect. (5)
Attenuation of the whole stimulus caused a substantial decrease in stop responses
equivalent to that resulting from attenuation of the burst alone.
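
The boundary estimate by linear interpolation can be sketched as follows. The
50-percent criterion and the response percentages in the example are
assumptions made for illustration; they are not data from this experiment.

```python
def category_boundary(closures_ms, percent_stay, criterion=50.0):
    """Closure duration at which the labeling function crosses `criterion`.

    Uses linear interpolation between the two data points that straddle
    the criterion. Returns None if the function never crosses it.
    """
    points = list(zip(closures_ms, percent_stay))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical labeling function (percent "stay" at each closure duration).
closures = [0, 10, 20, 30, 40]
stay = [5.0, 30.0, 70.0, 95.0, 100.0]
boundary = category_boundary(closures, stay)
print(boundary)  # 15.0
```

In this made-up example the function rises from 30 to 70 percent between 10
and 20 ms, so the interpolated boundary falls at 15 ms. A shift of the
boundary toward shorter closures corresponds to an increase in stop
responses.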

These results point toward absolute burst amplitude as the relevant
factor. Clearly, attenuating the burst's environment did not have the same
effect as amplifying (more precisely, restoring) the burst by the same amount
(see Figure 3). Contrary to expectations, attenuation of the vocalic portion
did not increase stop responses. Perhaps additional stop manner cues
contained in that portion (initial formant transitions and amplitude envelope)
were weakened by the attenuation, thus counteracting the gain in burst
salience relative to its environment. If so, however, we are forced to conclude
that the absolute amplitude of those cues matters, which is equally
interesting.

Figure 4. Design and results of Experiment 3. Labeling functions for the
five conditions are provided on the upper left, with the key on the
upper right. Rectangles in the key represent schematically the "s"
fricative noise, the "t" burst, and the "day" voiced portion. The
height of the rectangles represents amplitude relative to the base
stimulus (condition a). At the bottom, comparisons among the
various conditions are presented in terms of average category boundary
values (in ms of closure silence). Lower-case letters refer to the
key on top.

Another possibility is that the present study suffered from floor effects
due to listeners' inability to detect the burst when it was attenuated. This
would explain why the largest difference occurred between condition (a) and
all others. Note that conditions (a) and (b) were equivalent to the 12- and
24-dB attenuation conditions in Experiment 2. The average category boundaries
for these conditions were at 9 and 17 ms, respectively, in Experiment 2, and
at 10 and 20 ms in Experiment 3, a rather close agreement. Note also that, in
Experiment 2, a burst attenuated by 24 dB still had a significant perceptual
effect. The agreement between Experiments 2 and 3 suggests that the absolute
stimulus amplitudes were similar, and that no floor effect occurred.
Nevertheless, it seemed advisable to replicate the present results with the
burst amplitude set at somewhat higher absolute levels, and with inclusion of
a no-burst baseline condition.

Experiment 4 

This replication of Experiment 3 used new stimuli in a complete 2 x 3 x 2
orthogonal design. By including burstless stimuli in the design, it was
possible to examine the effects of fricative noise and vowel attenuation
separate from their effects on the relative salience of the burst, an important
control condition.

Method

Stimuli. Good tokens of "say" and "stay" were selected from among
several repetitions recorded by a new female speaker. Both utterances were
digitized at 20 kHz. As in Experiments 1-3, the fricative noise (170 ms long) was
taken from "say." The "day" portion of "stay" was about 450 ms in duration
and began with a release burst 13.35 ms long. The overall rms amplitude of
this burst was determined to be 5.5 dB below the vowel onset, 11.0 dB below
the vowel peak (only 20 ms later), and 4.1 dB above the fricative noise
maximum. Informal listening confirmed that this burst, as usual, was sufficient
for "stay" to be perceived without any closure silence (see also Exp. 5). To
be able to trade burst amplitude against silence, the most intense burst used
was 15 dB below the original. A total of 12 stimulus versions were created by
orthogonally combining three factors: fricative noise attenuation (0 or 10
dB), burst attenuation (15 or 25 dB, or no burst at all), and "vowel"
attenuation (0 or 10 dB). Each of these 12 versions occurred with five
closure durations ranging from 0 to 40 ms in 10-ms steps. The resulting 60
stimuli were recorded in 5 different randomizations.
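
The full factorial design can be enumerated in a few lines. This is a sketch
for checking the counts only; the variable names are hypothetical, and a
missing burst is coded as None by assumption.

```python
from itertools import product

# Factor levels as described in the text (levels in dB re the original).
FRICATIVE_DB = (0, -10)       # fricative noise: unattenuated or -10 dB
BURST_DB = (-15, -25, None)   # burst: -15 dB, -25 dB, or no burst (None)
VOWEL_DB = (0, -10)           # "vowel" portion: unattenuated or -10 dB
CLOSURES_MS = range(0, 41, 10)  # 0 to 40 ms in 10-ms steps

# 2 x 3 x 2 orthogonal combination of the three amplitude factors.
versions = list(product(FRICATIVE_DB, BURST_DB, VOWEL_DB))

# Every version occurs with every closure duration.
stimuli = [(f, b, v, c) for (f, b, v) in versions for c in CLOSURES_MS]

n_versions = len(versions)
n_stimuli = len(stimuli)
n_tape_items = n_stimuli * 5   # five randomizations on the tape
print(n_versions, n_stimuli, n_tape_items)
```

Unlike the semi-orthogonal designs of Experiments 1 and 2, every factor level
here occurs with every other, which is what permits the three-way analysis of
variance reported below.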

Subjects and procedure. Ten new paid volunteers identified the stimuli
as "say," "stay," "spay," or "skay." The last two response alternatives were
included because the author, as a pilot subject, had noticed a tendency to
hear these additional categories. The tape was repeated once, so that each
subject gave ten responses to each stimulus.

Results and Discussion

Of the ten subjects, three gave only "say" and "stay" responses, while
the other seven used one or both of the additional response categories as
well. In the initial analysis, all consonant cluster responses were pooled.

Since the burstless stimuli had been created by omitting the burst rather
than by infinitely attenuating it, 13.35 ms (the duration of the burst) must
be subtracted from their actual closure durations to compare results directly
for stimuli with and without bursts. This has been done graphically in Figure
5, where the arrows point toward the actual closure durations. The figure
shows average labeling functions for the three burst conditions, averaged over
fricative and vowel attenuation conditions. Clearly, the subjects gave many
more cluster responses to the stimuli with bursts than to those without.
Elimination of the burst resulted in a flattening of the labeling function;
40 ms of silence was not enough to make a burstless stimulus sound like an
unambiguous "stay." The figure also shows the expected effect of the 10-dB
burst attenuation. It is clear that this experiment avoided the danger of
floor effects; if anything, the burst amplitudes were somewhat too high.

The effects of variations in fricative noise and vowel amplitude, which
were smaller than the effects of burst amplitude, are summarized in Table 1 in
the form of response percentages averaged over all closure durations. A
three-way repeated-measures analysis of variance (with the factors Burst,
Fricative, and Vowel) was first conducted on the total cluster responses,
ignoring the incommensurability of actual closure durations for stimuli with
and without bursts. This analysis revealed, besides the expected Burst
effect, F(2,18) = 64.8, p < .001, significant main effects of both Fricative,
F(1,9) = 9.6, p < .02, and Vowel, F(1,9) = 12.4, p < .01, as well as
significant interactions between Burst and Fricative, F(2,18) = 10.7, p < .001, and
between all three factors, F(2,18) = 12.1, p < .001. To clarify the triple
interaction, separate analyses of variance were conducted on stimuli with and
without bursts. Stimuli with bursts exhibited significant main effects of
Burst, F(1,9) = 32.7, p < .001, Fricative, F(1,9) = 45.2, p < .001, and Vowel,
F(1,9) = 10.4, p = .01, as well as a marginal Burst by Fricative interaction,
F(1,9) = 5.4, p < .05, and a strong triple interaction, F(1,9) = 30.6, p <
.001. Thus, the triple interaction was not due to different patterns of
results for stimuli with and without bursts. The separate analysis of burstless
stimuli revealed only a significant effect of Vowel, F(1,9) = 8.5, p < .02,
not of Fricative.

Consider now the directions of these effects. The Burst effect, of 
course, was due to a decrease of cluster responses as the burst was attenuated 
or eliminated altogether (Figure 5). The Fricative effect, too, was in the 
expected direction: Attenuation of the fricative noise increased the number 
of cluster responses. (A similar but nonsignificant trend was observed in 
Exp. 3.) This is the kind of effect that might be expected if the fricative 
noise reduced the salience of the burst through some form of auditory forward 
masking (see Delgutte, 1980). This interpretation is supported by the finding
that the Fricative effect was absent in burstless stimuli, where there was no
burst to be masked (see Table 1).

Turning now to the Vowel effect, it can be seen in Table 1 that
attenuation of the vocalic portion, like attenuation of the fricative noise,
resulted in an increase of cluster responses, contrary to a nonsignificant
opposite trend observed in Experiment 3. Since this was true regardless of
whether a burst was present or absent, the effect was apparently not due to
release from a backward masking effect of the vowel on the release burst, or
simply to an increase in the salience of the burst relative to the vowel.

Figure 5. Effect of burst amplitude in Experiment 4, averaged over other
amplitude conditions. Numbers refer to burst amplitude (in dB)
relative to original burst. Closure durations (abscissa, in ms) are
nominal; actual silence durations in the no-burst condition are
indicated by arrows.



Table 1

Response pattern in Experiment 4, averaged over closure durations.
Cells that are missing from the source are shown as "--".

Stimulus amplitude (dB)            Response (percent)

Burst      Fricative  Vowel    "say"   "stay"  "svay"  "spay"  Total cluster

-15           0         0      16.6    79.6     --      0.6        83.4
-15         -10         0       6.6    87.4     --      2.4        93.2
-15           0       -10       --     90.6     1.0     0.0        91.6
-15         -10       -10       1.2    97.4     0.6     0.6        98.6

-25           0         0      20.2    57.4    16.6     5.6        79.6
-25         -10         0      19.6    57.2    17.6     5.4        80.4
-25           0       -10      20.2    74.6     3.6     1.4        79.6
-25         -10       -10      10.4    86.0     2.2     1.4        89.6

no burst      0         0      67.6    12.4    10.0    10.0        32.4
no burst    -10         0      70.6     7.0    11.4    10.6        29.2
no burst      0       -10      60.6    25.4     5.2     6.6        39.2
no burst    -10       -10      61.6    20.6     6.6     6.6        36.2

The results are complicated by the triple interaction, which was due to
the fact that, with the higher burst amplitude, fricative and vowel
attenuation seemed to have independent effects whereas, with the lower burst
amplitude, only simultaneous attenuation of both produced an effect. An
explanation of this complex pattern is beyond reach at the moment.

In summary, this experiment, in conjunction with Experiment 3, provides
little support for a role of relative burst amplitude in stop manner
perception. While the preceding fricative noise may exert a slight masking
effect on the burst, the amplitude of the following vocalic portion seems to
have its perceptual effects primarily by changing the relative salience of
cues contained in that portion itself. While the present data cannot be
considered the last word on the issue, the possibility of a fixed perceptual
criterion in the amplitude domain deserves further attention, both with
regard to the perception of stop manner and to place-of-articulation
distinctions in stops (see Ohde & Stevens, 1983) and fricatives (Gurlekian,
1981).

Experiment 5 

The preceding experiments, Experiment 2 in particular, demonstrate a 
remarkable sensitivity of listeners to the presence of even very weak release 
bursts. This suggests the hypothesis that the point at which a burst becomes 
ineffective and ceases to trade with closure silence actually coincides with
the auditory detection threshold for the burst. This hypothesis was tested in
the present experiment. In addition, the study examined whether the
detectability of the burst is increased when the preceding fricative noise is
removed.

Method 

Stimuli. The stimuli were derived from the utterances that also provided
the basis for the stimuli of Experiment 4. In addition to the original
stimulus (full burst amplitude), six levels of burst attenuation were
employed: 10, 20, 25, 30, 35, and ∞ dB. In the identification test, these
seven stimuli occurred with nominal closure durations of 0, 10, 20, and 30
ms. Ten different randomizations of the 28 stimuli were recorded.
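
The identification test design described above (seven burst conditions
crossed with four nominal closure durations, in ten randomized orders) can be
sketched as follows. This is only an illustration: the function name and the
random seed are hypothetical, and the last attenuation level is read here as
complete removal of the burst and represented by None, which is an assumption.

```python
import itertools
import random

# Burst attenuation in dB re the original burst; 0 is the original stimulus.
# None stands for the burstless stimulus (assumed reading of the last level).
BURST_ATTENUATIONS_DB = [0, 10, 20, 25, 30, 35, None]
NOMINAL_CLOSURES_MS = [0, 10, 20, 30]


def build_identification_test(n_randomizations=10, seed=0):
    """Return n_randomizations shuffled copies of the 7 x 4 = 28 stimuli."""
    stimuli = list(itertools.product(BURST_ATTENUATIONS_DB, NOMINAL_CLOSURES_MS))
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    blocks = []
    for _ in range(n_randomizations):
        block = stimuli[:]      # copy so that each block is shuffled separately
        rng.shuffle(block)
        blocks.append(block)
    return blocks


blocks = build_identification_test()
print(len(blocks), len(blocks[0]))
```

Each block contains every (attenuation, closure) pair exactly once, so each
stimulus receives ten responses per listener.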

In addition, two discrimination tests were assembled, which required
subjects to detect the presence of a burst. The two tests were identical
except that in one the initial fricative noise was omitted from all stimuli
while, in the other, the fricative noise was followed by a fixed 10-ms
closure interval. A fixed-standard same-different paradigm was employed. The
fixed standard was the burstless stimulus; it occurred first in every
stimulus pair. After a fixed interval of 500 ms, the comparison stimulus
occurred; it either did or did not contain a release burst. Over six
successive test blocks, the burst in the comparison stimulus was attenuated
by 0, 10, 20, 25, 30, and 35 dB. Each test block consisted of 50 trials, the
first 10 of which were practice, with the responses alternating between
"same" and "different" and known in advance. Half of the remaining 40 trials
were "same" and half were "different," in random order. The intertrial
interval was 2 s.
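
The structure of one discrimination block, and the percent-correct score
used to summarize it, can be sketched as below. The function names and the
seed are hypothetical, and starting the practice trials with "same" is an
assumption; the text only says that the practice responses alternated.

```python
import random


def build_discrimination_block(n_practice=10, n_test=40, seed=0):
    """Return the trial types of one block: practice trials, then test trials."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    # Practice trials alternate; beginning with "same" is an assumption.
    practice = ["same" if i % 2 == 0 else "different" for i in range(n_practice)]
    # Test trials: half "same", half "different", in random order.
    test = ["same"] * (n_test // 2) + ["different"] * (n_test // 2)
    rng.shuffle(test)
    return practice + test


def percent_correct(trial_types, responses):
    """Percentage of trials on which the response matches the trial type."""
    hits = sum(t == r for t, r in zip(trial_types, responses))
    return 100.0 * hits / len(trial_types)


block = build_discrimination_block()
test_trials = block[10:]  # the 10 practice trials are not scored
print(percent_correct(test_trials, ["same"] * len(test_trials)))
```

A listener who cannot hear the burst and always answers "same" scores 50
percent on the test trials, which is the chance level against which the
detection scores are read.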

Subjects and procedure. Ten paid volunteers participated in the experiment,
six of whom had also been subjects in Experiment 4. In the identification
test, which was always presented first, they responded "say" or "stay,"
with "svay" and "spay" as additional options. In the discrimination tests,
the responses were "s" ("same") and "d" ("different"). The order of the two
discrimination tests was counterbalanced across subjects. Playback amplitude
was controlled by adjusting the level so as to achieve a constant maximum
deflection on a vacuum tube voltmeter, and by keeping it at that level
throughout the experiment. All tapes had been recorded at the same level.
The peak amplitude of the vowel (and, hence, of the unattenuated burst as
well; see Exp. 4) at the subjects' earphones was estimated to be approximately

Figure 6. Identification and burst detection results from Experiment 5.
Filled squares plot the category boundary in ms of silence (right
ordinate) as a function of burst amplitude. Silence duration for
the no-burst condition is nominal; actual silence at the boundary
was 32 ms, as indicated. Circles show burst detection scores as
percent correct (left ordinate) for stimuli with and without
initial [s]-noise.



Results and Discussion

The average data are presented in Figure 6. The labeling boundary for
stimuli with bursts attenuated by 20 dB or more is represented by the filled
squares. Stimuli with an unattenuated burst were uniformly identified as
"stay," and those with a -10 dB burst received only 16 percent "say"
responses when no silence was present, so no boundaries could be determined
for these stimuli. As expected, the boundary shifted toward increasingly
longer values of silence as the burst was attenuated. Note that the boundary
seemed to increase beyond the 35 dB burst attenuation, although the
difference between this condition and the burstless condition fell short of
significance in a t-test.
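
The text does not say how the category boundaries plotted in Figure 6 were
computed. A minimal sketch, assuming the boundary is taken as the closure
duration at which stop responses cross 50 percent, found by linear
interpolation between adjacent closure durations (the function name and the
response percentages are hypothetical, not data from the experiment):

```python
def category_boundary(durations_ms, pct_stop, criterion=50.0):
    """Closure duration at which pct_stop first rises through the criterion.

    Returns None when the function never crosses the criterion from below,
    e.g. when stop responses already predominate at the shortest closure.
    """
    points = list(zip(durations_ms, pct_stop))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            # Linear interpolation between the two bracketing points.
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


closures = [0, 10, 20, 30]
print(category_boundary(closures, [10.0, 30.0, 70.0, 95.0]))  # 15.0
print(category_boundary(closures, [84.0, 90.0, 95.0, 99.0]))  # None
```

The second call illustrates the situation reported for the two strongest
bursts: when stop responses predominate even without silence, no boundary can
be determined within the range of closure durations tested.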

The discrimination (i.e., burst detection) results for the same stimuli
are plotted in terms of percent correct as the filled circles in Figure 6.
Performance was perfect for the original burst and declined with increasing
burst attenuation, first slowly, and then more rapidly beyond 25 dB. For
stimuli with the initial [s]-noise, performance reached chance at the 35 dB
attenuation. Note that the category boundary in the identification task
continued to shift beyond that point for at least some listeners, suggesting
that subjects' sensitivity to the burst was at least as great in phonetic
labeling as in auditory discrimination. This result provides strong evidence
of the sensitivity of phonetic categorization processes to very subtle
changes in acoustic information.

Figure 6 also shows that burst detection was somewhat improved when the
initial [s]-noise was removed, but only at the two weakest burst intensities
(not a significant difference). Thus there may have been a slight auditory
masking effect of the fricative noise on the burst, in agreement with
Experiments 3 and 4.



Experiment 6

The purpose of Experiment 6 was to demonstrate a trading relation between
burst amplitude and closure duration for the perception of a labial stop
consonant. Labial bursts are weaker than alveolar and velar bursts (Zue,
1976), and informal observations have suggested that they are generally
insufficient cues for stop manner. In other words, some closure silence is
usually needed to perceive "sp," even with the original burst in place. This
raises the question of whether labial bursts function as manner cues at all;
perhaps they merely add to the effective closure silence. Moreover, labial
bursts offer the opportunity of observing not only effects of attenuation but
also of amplification. Would an appropriately amplified labial burst become a
sufficient stop manner cue?

The "slit"-"split" contrast was selected for the present study for several
reasons. First, it has been used extensively in earlier studies (Bastian,
Eimas, & Liberman, 1962; Dorman et al., 1979; Fitch et al., 1980; Marcus,
1978; Summerfield et al., 1981). Second, a "p" tends to be heard in this
context as long as there are no strong cues to a nonlabial place of
articulation in the signal portions surrounding the silent closure interval.
That is, listeners report "split" when separately produced "s" and "lit"
utterances are joined together with a sufficient interval of silence in
between (Dorman et al., 1979). According to limited informal observations,
the [l] resonances following a stop closure, unlike those of a full vowel, do
not seem to harbor any significant formant transition cues to stop manner and
place of articulation, which makes the "slit"-"split" contrast different from
the "say"-"stay" contrast employed in Experiments 1-5. This fact may be
partially responsible for the finding (cf. Fitch et al., 1980, and Best et
al., 1981) that, in burstless stimuli, the typical "slit"-"split" boundary is
located at much longer silent closure intervals (50-90 ms; for an exception,
see Marcus, 1978) than the "say"-"stay" boundary (10-30 ms). Differences in
place of stop articulation and in phonetic environment may also contribute to
this boundary difference, however. One reason for conducting Experiment 6 (as
well as Experiment 7) was to see whether the presence of a labial release
burst, amplified to equal the power of an alveolar burst, might shift the
"slit"-"split" boundary to the short silences characteristic of the
"say"-"stay" boundary.

Method

Stimuli. A good token of "split" was selected from several utterances
produced by a female speaker and was digitized at 20 kHz. In the original
utterance, the initial [s]-noise (105 ms) was followed by a silent closure
interval (about 150 ms) and a "blit" portion consisting of an initial release
burst (16 ms), a voiced portion (about 230 ms), a silent [t]-closure, and a
final [t]-release burst. The major energy of the labial release burst was
concentrated in the first 4 ms. The rms amplitude of these first 4 ms was
determined to be about 14 dB below the [l] maximum, and 20 dB below the [I]
vowel maximum. The final 12 ms of the burst were about 13 dB below its
initial 4 ms.
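
The level measurements reported above amount to comparing the rms amplitude
of a short waveform segment with that of a reference segment, expressed in
dB. A minimal sketch follows; the helper names are hypothetical, and taking
the reference as the rms of another segment (rather than, say, a peak value)
is an assumption about how the maxima were measured.

```python
import math

SAMPLE_RATE_HZ = 20_000  # digitization rate stated in the text


def ms_to_samples(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples spanned by a duration given in milliseconds."""
    return round(duration_ms * sample_rate_hz / 1000)


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def level_re_reference_db(segment, reference):
    """Level of segment relative to reference, in dB (negative = weaker)."""
    return 20.0 * math.log10(rms(segment) / rms(reference))


n = ms_to_samples(4)        # the first 4 ms of the burst: 80 samples
burst = [0.1] * n           # toy waveforms, not measured data
reference = [1.0] * n
print(n, round(level_re_reference_db(burst, reference), 1))
```

With these toy waveforms the burst is one tenth of the reference amplitude,
so the printed level is 20 dB below the reference.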

Three additional stimulus versions were created either by amplifying or
attenuating the 16-ms burst by 12 dB, or by eliminating it altogether. The
(actual) silent closure duration in each of the four versions was varied from
40 to 100 ms in 10-ms steps. The resulting 28 stimuli were recorded in 10
different randomizations.
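
Amplifying or attenuating a burst by a given number of dB corresponds to
multiplying its samples by a constant factor of 10^(dB/20). The sketch below
illustrates this under stated assumptions: the function names are
hypothetical, the waveform is a plain list of samples, and elimination of the
burst is shown as excision of the samples.

```python
def db_to_gain(change_db):
    """Linear amplitude factor for a level change in dB (+ amplifies)."""
    return 10.0 ** (change_db / 20.0)


def scale_burst(samples, burst_start, burst_len, change_db):
    """Return a copy of samples with the burst segment scaled by change_db."""
    gain = db_to_gain(change_db)
    out = list(samples)
    for i in range(burst_start, burst_start + burst_len):
        out[i] *= gain
    return out


def excise_burst(samples, burst_start, burst_len):
    """Return a copy of samples with the burst segment cut out entirely."""
    return list(samples[:burst_start]) + list(samples[burst_start + burst_len:])


print(round(db_to_gain(12.0), 3))   # about 3.981
print(round(db_to_gain(-12.0), 3))  # about 0.251
```

A 12-dB change thus corresponds to roughly a fourfold increase or decrease in
amplitude. Note that excision shortens the waveform by the burst duration,
whereas scaling leaves its length unchanged.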

Subjects and procedure. The same ten subjects as in Experiment 3
identified the stimuli as "slit" or "split." Because the author noted that
some of the stimuli sounded like "stlit" to him, this response alternative
was provided as well. Stimuli without any clear consonant between the "s" and
the "l" were to be considered instances of "slit."

Results and Discussion 

Since "stlit" responses were rather infrequent, Figure 7 shows the
combined percentage of "split" and "stlit" responses as a function of closure
duration and of burst conditions. Three results are evident. First,
attenuation of the burst by 12 dB had a clear effect, especially at the
longer silences. Apparently, burst attenuation resulted not so much in a
boundary shift as in a flattening of the labeling function. Second, the
condition in which there was no burst at all gave results very similar to the
attenuated-burst condition, provided the no-burst function is shifted to make
the nominal closure durations comparable across the two conditions. (The
actual closure durations were 4 ms longer, as indicated by the arrows in
Figure 7.) This result is not surprising, given the initial low amplitude of
the labial burst. Third, amplification of the burst by 12 dB had,
surprisingly, no effect at all. One side effect of the amplification seemed
to be a tendency to hear "stlit" rather than "split," in accord with recent
data by Ohde and Stevens (1983) showing that burst amplitude is a cue to the
labial-alveolar distinction. However, the present tendency was exhibited only
by three of the ten subjects. A bias against the unfamiliar "stl" cluster may
have played a role.

The effect of burst attenuation or elimination demonstrates that labial
bursts, too, have a function as stop manner cues. The absence of any effect
of burst amplification, however, suggests that the "slit"-"split" boundary
cannot be easily pushed toward shorter values of silence. Although one might
have expected burst amplification to shift the boundary on purely
psychoacoustic grounds, it seems that the amplitude increment was either
ignored by listeners or channelled into decisions about stop place of
articulation rather than stop manner. This curious and potentially important
finding called for a replication experiment.

Figure 7. Effects of labial release burst amplitude in Experiment 6. Numbers
refer to amplitude in dB relative to the original burst. Closure
durations in the no-burst condition are nominal; actual durations
are indicated by arrows.



Experiment 7 

This study was similar to Experiment 6, except for differences in stimuli
and the ranges of closure durations and burst amplitude values.



Method 



Stimuli. Good tokens of the utterances "slash" and "splash" were recorded
by a different female speaker and digitized at 20 kHz. The fricative noise
of "slash" (142 ms) was used in all stimuli. The remainder (about 590 ms) was
taken from "splash." This portion included an initial 10-ms release burst.
(The original closure duration was 66 ms.) The amplitude of the burst was
determined to be 7.4 dB below the [l] onset, 11.9 dB below the vowel maximum
(75 ms later), and 2.9 dB above the fricative noise maximum (120 ms after
noise onset). Six stimulus versions were created by leaving the burst
unchanged, amplifying or attenuating it by 10 or 20 dB, or omitting it
altogether. Each version occurred with (actual) closure durations ranging
from 20 to 60 ms in 10-ms steps. The resulting 30 stimuli were recorded in 10
different randomizations.

Subjects and procedure. The same ten subjects as in Experiment 4
participated. They identified the stimuli as "slash" or "splash," with
"stlash" as an additional option. To prepare the subjects for the amplified
bursts, the instructions mentioned that some of the stimuli might have "pops"
in them, which were to be ignored. The data of one subject had to be
discarded because of numerous response omissions.

Results and Discussion

The results are shown in Figure 8. The left panel displays the labeling
functions for the different burst amplitude conditions. In two respects, the
findings replicate the principal results of Experiment 6: Attenuation of the
burst necessitated a longer interval of silence, whereas burst amplification
did not have the opposite effect; rather, amplified bursts seemed to function
like slightly attenuated ones. In two other respects, the results are
different from those of Experiment 6: The boundaries were considerably
shorter here, and even the 20-dB burst attenuation condition still produced
substantially more stop percepts than the burstless condition. These
differences may indicate that the present release burst was a more powerful
manner cue than that in the previous experiment. In addition, the different
range of closure durations, as well as other stimulus characteristics, may
have contributed to the boundary difference.



Figure 8. Effects of labial release burst amplitude in Experiment 7. Left
panel is analogous to Figure 7, with legend provided by right
panel. Right panel shows category boundary as a function of burst
amplitude. Actual silence duration in no-burst condition is
indicated by arrow.



The right-hand panel in Figure 8 summarizes the data by plotting the
boundary location as a function of burst amplitude. It is plain that burst
amplification did not continue the trend established by the burst attenuation
results: As soon as the amplitude exceeded that of the original burst, its
trading relation with silence duration came to an abrupt end. How is this
finding to be explained?

Only four of the nine subjects gave any "stlash" responses. These
responses were fairly broadly distributed but tended to occur with the higher
burst amplitudes and at short closure durations. However, these weak trends
observed in a few subjects are not nearly sufficient to explain the sudden end
of the trading relation between burst amplitude and silence duration.

A more relevant observation is that, to the author (and presumably to the
subjects as well), the amplified bursts sounded like extraneous pops
superimposed on the stimuli. This subjective impression suggests that
amplification of the burst destroyed its auditory coherence with the other
signal portions and caused it to "stream off." If so, it is particularly
interesting that subjects perceived these stimuli not as if they had no
bursts at all, but rather as if they had a burst of "normal" amplitude (see
Figure 8). This finding thus seems related to two other intriguing phenomena
described in the literature: duplex perception (e.g., Liberman, Isenberg, &
Rakerd, 1981) and phoneme restoration (e.g., Samuel, 1981).

In duplex perception, a component of a speech stimulus is heard as a
separate nonspeech event while, at the same time, it contributes to phonetic
perception. Although the auditory segregation of the component is commonly
achieved by dichotic channel separation, monaural duplex perception may occur
when an acoustic cue, because of certain extreme properties, loses its
coherence with the rest of the stimulus (see also Miller, Connine, Schermer,
& Kluender, 1983). The present experiment seems to provide such an instance.
Its results are also related to phoneme restoration, which is said to occur
when a portion of a speech signal is replaced with an extraneous sound without
affecting phonetic perception. Samuel (1981) has shown that, for restoration
to occur, the extraneous sound must be a potential masker of the replaced
portion. Thus, the so-called phoneme restoration effect may, at least in
part, be a "cue restoration effect"; that is, listeners fill in missing
acoustic information. A particularly relevant study was conducted by Pastore,
Szczesiul, Rosenblum, and Schmuckler (1982): A syllable-initial [p] in one
ear was perceived as "t" when a noise burst occurred in the other ear, but
only when the noise included the frequencies typical of [t] release bursts.
These findings combine aspects of duplex perception and cue restoration, as
indeed do the present results. The amplified bursts were, of course, the best
possible maskers of spectrally identical "normal" bursts, and because they
segregated as "pops" from the rest of the signal, listeners were led to
restore the original burst perceptually. If this interpretation is correct,
then the data provide a particularly interesting demonstration of the detailed
tacit knowledge of acoustic (or, perhaps, articulatory) properties of speech
that listeners possess and apply in the course of phonetic perception.



General Discussion 



The present series of studies fills some gaps in our knowledge of the
acoustic cues for stop manner perception. They uniformly show that the
release burst is a highly important cue for the perception of stops after [s].

One result that emerges from the experiments is that a natural alveolar
release burst is usually sufficient to cue perception of a stop in the absence
of closure silence (Exps. 1 and 5), whereas a natural labial release burst is
usually not sufficient by itself (Exps. 6 and 7). Although, in the present
studies, alveolar release bursts were followed by pronounced vocalic formant
transitions while labial bursts were not, preliminary observations indicate
that the generalization holds regardless of following context, and that velar
release bursts are similar in salience to alveolar ones. The greater power of
alveolar and velar bursts is, in large part, due to their greater amplitude
and longer duration, although spectral composition and/or different perceptual
criteria for stops at different places of articulation may also play a role.

A second result of the present research is that listeners are extremely
sensitive to the presence of even very brief or severely attenuated release
bursts (Exps. 1, 2, 5). Experiment 5 showed that, when labeling stimuli
phonetically, listeners are at least as sensitive to the presence of such
minimal bursts as they are in a low-uncertainty burst detection task. As
Nooteboom (1981) has pointed out, "phoneme identification seems to be an
excellent way of measuring just noticeable differences" (p. 149). This is not
a trivial result, for it suggests that the perceptual criteria employed in
phonetic identification are extremely stable and finely tuned, despite the
high stimulus uncertainty prevailing in a randomized identification test.
Indeed, preliminary data suggest that this stability and sensitivity is
maintained even in listening to fluent speech. The operation of stable
criteria, internal to the listener and presumably shaped by language
experience, is a hallmark of phonetic perception. Nevertheless, these
criteria must also be flexible to accommodate natural variability in speech,
such as might be due to changes in articulatory rate. In other words, the
criteria are stable but not fixed; they are stable in the sense that their
variability is not random but controlled by relevant factors.

A third finding is that release bursts, when shortened or attenuated in
various degrees, engage in a regular trading relation with closure duration, a
second important cue for stop manner: The weaker the burst, the more silence
is needed to perceive a stop. There are two contrasting hypotheses about the
origin of such a trading relation: It may either be phonetic or
psychoacoustic in origin. According to the phonetic hypothesis (see Repp,
1982), the listeners' internal criteria specify the "prototypical" acoustic
properties for the relevant phonetic segments, so that a reduction in one
relevant property must be compensated for by an increase in another property
to maintain the same response distribution. (A similar prediction could be
derived from the information integration model of Oden and Massaro, 1978; see
also Massaro and Oden, 1980.) According to the psychoacoustic hypothesis, on
the other hand, the principal cue for stop manner resides in the onset
characteristics of the signal portion (which includes the burst) following the
closure silence, and the role of the silence is to prevent a forward masking
effect of the preceding fricative noise on the auditory representation of
those characteristics, and/or to enable the listener to attend to the critical
onset properties. (This hypothesis is also congenial to the acoustic
invariance hypothesis of Stevens and Blumstein, 1978.)

The present results are not wholly incompatible with psychoacoustic
explanations. For example, the finding that attenuation of the fricative noise
resulted in a reduction of the amount of silence needed for stop perception
(Exps. 3 and 4), but only when a burst was present (Exp. 4), could be
attributed to auditory forward masking. Effects of burst amplitude on stop
manner perception also lend themselves to a psychoacoustic interpretation in
terms of burst detectability. Data from other recent studies, however, argue
strongly against a psychoacoustic account at least of the role of silence in
stop manner perception. Best et al. (1981) found that the trading relation
between closure duration and the F1 transition for the "say"-"stay" contrast
was absent in nonspeech analogs of the stimuli. Repp (1983b) demonstrated
that this same trading relation, as well as that between closure duration and
burst amplitude in "slit"-"split," was restricted to the phonetic boundary
region but absent within phonetic categories. Perhaps the strongest result was
recently reported by Pastore, Szczesiul, Rosenblum, and Schmuckler (1983):
When the [s]-noise and the vocalic portion of "slit"-"split" tokens were
differentially lateralized, so as to reduce peripheral auditory masking and
facilitate selective attention, the amount of closure silence needed to
perceive "split" remained the same. These results strongly favor a phonetic
account of the integration of acoustic cues in stop manner perception, without
ruling out certain psychoacoustic interactions in the peripheral auditory
system that may, for example, affect burst detectability.

Two findings were unexpected and should provide a stimulus for further
research. One result is that, apparently, burst amplitude has its effect on
stop manner perception in absolute terms, not relative to the amplitude of the
following signal portion (Exps. 3 and 4). The role that potential stop manner
cues in this voiced portion may have played needs to be examined in a more
controlled fashion. The results may suggest, however, that important stop
manner cues reside in the first few milliseconds following the closure, that
is, in the absolute magnitude and slope of the sudden energy increment.

A second unexpected finding was the absence of a trading relation between
amplified labial release bursts and closure duration (Exps. 6 and 7). This
phenomenon was tentatively interpreted as an instance of "cue restoration":
The amplified burst was perceived as an extraneous "pop" and thus, instead of
functioning as a cue in the speech signal, assumed the role of a masker for
the cue expected by listeners, viz., of the "normal" release burst represented
in listeners' detailed tacit knowledge of the normative acoustic properties of
speech. A relation may exist between this phenomenon and the demonstration by
Pols and Schouten (1976) that burstless initial stop consonants are more
accurately perceived when preceded by pink noise (a potential masker of an
absent burst) and Samuel's (1981) findings on the role of "bottom-up
confirmation" in the phoneme restoration paradigm.

In conclusion, the present experiments have yielded factual information
on the perception of a little-investigated cue as well as several intriguing
effects that should stimulate further research. The results provide a modest
challenge to psychoacoustic theories of speech perception. From a
psychoacoustic viewpoint, stop manner perception seems a much simpler problem
than, for example, perception of place of articulation: All that may be
involved is the detection of some critical amount of energy increment or
discontinuity in the signal. The eventual success or failure of
psychoacoustic theories will rest, of course, on their ability to explain all
kinds of phonetic perception, as well as to predict specific results from a
model of auditory speech processing. Interesting work along these lines is
now in progress (Delgutte, 1980, 1982; Goldhor, 1983), and the present data,
being relatively straightforward, may provide a convenient testing ground for
new models of peripheral auditory processing.

Footnotes

1

Dorman et al. (1980) found that the presence of an alveolar release
burst was not sufficient for perception of a stop in vowel-fricative context
(i.e., of an affricate, as in "ditch") in the absence of closure silence.
While it is difficult to generalize from results obtained with single tokens
of natural speech, it is possible that release bursts are more effective stop
manner cues in fricative-vowel than in vowel-fricative environments.

2 

Amplitude measurements were performed after redigitizing the utterance
without preemphasis, using a program of the ILS speech analysis system. The
powerful appearance of the burst in Figure 1 is in part due to high-frequency
preemphasis. "Vowel onset" refers here to the 20 ms of waveform immediately
following the 20-ms burst. The burst, as defined here, may have included a
first, extremely weak glottal pulse (between cutpoints 5 and 6 in Figure 1).
No attempt was made to distinguish between transient, fricative, and
aspirative phases of the burst (see Fant, 1973).

3

Playback amplitude was not precisely calibrated but was held constant
within a few dB by maintaining a certain setting of the level control on the
tape recorder (Ampex AG500) for all subjects. The peak amplitude of the vowel
at the subjects' earphones (approximately 83 dB SPL) was estimated
postexperimentally by converting the peak deflection of a vacuum tube
voltmeter in response to the test syllables into dB SPL, according to a chart
prepared by Haskins Laboratories technicians.

4 

For no particular reason, the burst was excised rather than infinitely
attenuated. The latter procedure would have been preferable, but there are no
serious consequences for the interpretation of the results. (The same applies
to Experiments 6 and 7.)

5 

The reason for the different effects of vowel attenuation in Experiments
3 and 4 is not clear; they may have been due to different strengths of the
stop manner cues in the vocalic portions used. In Experiment 3, no tendency
to hear consonants other than "t" was noted, and a 40-ms silence always
yielded close to 90 percent "stay" responses. In Experiment 4, on the other
hand, a significant number of "svay" and "spay" responses occurred, and even
when these were pooled with "stay" responses, the total percentage for
burstless stimuli with a 40-ms silence was only 72. Therefore, the vocalic
portion in Experiment 4 seemed to contain weaker stop manner cues than that
in Experiment 3, and this may explain the different effects of attenuation.

6

Finally, the pattern of "svay" and "spay" responses may be considered
(Table 1). Attenuation of the burst increased both types of responses,
simultaneously decreasing "stay" responses. Total elimination of the burst
increased primarily "spay" responses. There was also a consistent vowel
effect, with both "svay" and "spay" responses being less frequent when the
vocalic portion was attenuated. Fricative amplitude, on the other hand, had no
effect on these responses at all. Closure duration did play a role (not shown
in Table 1): "svay" responses decreased as closure duration increased in
stimuli with bursts, but increased (high vowel amplitude) or remained constant
(low vowel amplitude) in burstless stimuli; "spay" responses showed a strong
increase with closure duration, provided they occurred at all (stimuli with
low burst and high vowel amplitude, and burstless stimuli). The latter trend
is in agreement with earlier observations that long closure durations favor
perception of a labial place of articulation (Bailey & Summerfield, 1980).
"Svay" percepts, on the other hand, may have resulted from either
"misinterpreting" the burst as frication when the closure was short, or (in
burstless stimuli) they may have taken the place of a possible "sthay"
category, which is difficult to perceive but corresponds to the informal
observation that burstless "day" portions often resemble "they." In either
case, however, attenuation of the vocalic portion favored "stay" over "svay"
and "spay," which indicates a role of the vocalic onset envelope in this
distinction.

7 

There is a long-standing controversy, familiar from the literature on
categorical perception (see Repp, 1983a, for a review), about whether speech
perception experiments should be concerned with what listeners can do in an
optimal situation or with what they do under normal circumstances. Auditory
thresholds are often assessed in highly practiced listeners after many hours
of training. No strong claim is being made here that these optimal thresholds
coincide with the limit of burst effectiveness in phonetic identification,
although they obviously define a lower bound. Rather, the hypothesis tested
here concerns the burst detection threshold for unpracticed listeners in a
brief discrimination test, on the assumption that this threshold is more
likely to match the threshold of burst effectiveness in identification. In any
case, the hypothesis is that listeners' sensitivity in phonetic identification
is no worse than in overt burst detection; if it is better, we would have
evidence that the subconscious processes of phonetic identification maximally
exploit auditory sensitivities.

8

Some subjects gave many "svay" and/or "spay" responses; the former
occurred most often at intermediate burst amplitudes and silences, the latter
at low burst amplitudes and long silences. For the purpose of group boundary
determination, these responses were grouped with "stay" responses.

9 

Inspection of the unpreemphasized waveform suggested that a first, very
low-amplitude glottal pulse may have been included in the burst as defined
here.

10

It seems likely that amplification of the burst by just a few dB would
still have increased its power as a manner cue. However, the present data
suggest that the trading relation with silence duration ends well before a
10-dB gain is reached.

A PERCEPTUAL ANALOG OF CHANGE IN PROGRESS IN WELSH*



Suzanne Boyce+



Standard Literary Welsh exhibits a phenomenon known as "initial
mutation," in which a lexical item may retain the initial consonant of its
citation form or undergo one of three different rules that change its initial
consonant by one feature. These rules are traditionally known as the SOFT,
NASAL, and ASPIRATE mutations. The SOFT mutation changes voiceless stops and
liquids to their voiced counterparts, and changes voiced stops to homorganic
voiced fricatives. The ASPIRATE mutation changes voiceless stops to
homorganic voiceless fricatives. The NASAL mutation changes voiced and
voiceless stops to nasals but maintains voicing and aspiration characteristics
(Fynes-Clinton, 1913). Examples (1)-(4) below illustrate, in order, the words
/pot/ 'pot' and /beik/ 'bicycle' in CITATION form, and SOFT, ASPIRATE, and
NASAL mutations.



(1a) (CITATION)    [ə pot]      The pot.
(2a) (SOFT)        [ei bot]     His pot.
(3a) (ASPIRATE)    [ei fot]     Her pot.
(4a) (NASAL)       [və mʰot]    My pot.

(1b)               [ə beik]     The bicycle.
(2b)               [ei veik]    His bicycle.
(3b)               [ei beik]    Her bicycle.
(4b)               [və meik]    My bicycle.
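The rules illustrated in examples (1)-(4) can be sketched as simple lookup tables on the initial consonant. This is only an illustration under stated assumptions: the function name `mutate`, the table names, and the digraph spellings used for the voiceless nasals ("mh", "nh", "ŋh") are hypothetical conveniences, and the liquids and [m] that also undergo the SOFT mutation are left out for brevity.

```python
# Illustrative sketch of the mutation rules described in the text.
# Assumptions: ASCII-style transcriptions, only oral stops are handled,
# and voiceless nasals are spelled as digraphs ("mh", "nh", "ŋh").

SOFT = {
    "p": "b", "t": "d", "k": "g",   # voiceless stops become voiced
    "b": "v", "d": "ð",             # voiced stops become voiced fricatives
    "g": "",                        # the fricative reflex of [g] has been lost
}
ASPIRATE = {"p": "f", "t": "θ", "k": "x"}   # voiceless stops become fricatives
NASAL = {
    "p": "mh", "t": "nh", "k": "ŋh",        # voiceless stops become voiceless nasals
    "b": "m", "d": "n", "g": "ŋ",           # voiced stops become voiced nasals
}
RULES = {"CITATION": {}, "SOFT": SOFT, "ASPIRATE": ASPIRATE, "NASAL": NASAL}


def mutate(form: str, rule: str) -> str:
    """Apply one mutation to the initial consonant of a citation form.

    Consonants the rule does not mention are returned unchanged, which is
    why /beik/ keeps its [b] under the ASPIRATE mutation.
    """
    table = RULES[rule]
    initial = form[0]
    return table.get(initial, initial) + form[1:]


if __name__ == "__main__":
    for rule in RULES:
        print(rule, mutate("pot", rule), mutate("beik", rule))
```

Keeping each rule as a separate table mirrors the description of the mutations as rules that change the initial consonant by one feature; a rule that does not apply to a given consonant simply has no entry for it.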



The mutations are triggered by a preceding word or a particular syntactic
context rather than phonological environment. Triggering contexts are
idiosyncratic and dissimilar; typical contexts for the SOFT mutation, for
instance, are after the word i 'to,' adjectives after feminine nouns, and
negative verbs that begin with [b], [d], [g]. Strictly speaking, therefore,
conditioning for the mutations is neither morphological, syntactic, nor
phonological, but something of all three.

There is a certain amount of converging evidence that the ASPIRATE and
NASAL mutations are used less and less frequently in the spoken language.
Jones (1977), for instance, states that "the Aspirate Mutation after other
words [than ei 'her'] is rarely heard in spoken Welsh" (p. 105) and that
"there is a tendency in many areas to use the Soft Mutation rather than the
Nasal after yn ['in']" (p. 331). The most detailed analysis of this trend may
belong to Awbery (in press), who presents evidence from a number of Southern
dialects that the SOFT mutation is gaining ground at the expense of the
ASPIRATE and NASAL mutations. Thus, in environments where Standard Welsh
would use the NASAL or ASPIRATE mutation, Southern dialects may substitute the
SOFT mutation. In addition, she notes a number of environments where the
ASPIRATE mutation is dropped in favor of the CITATION form. Awbery states
that dialects may differ as to which environments and which lexical items
undergo the change, and that these changes are more common for younger than
older speakers. Her examples for the (Standard) citation forms /ka:nol/,
/kləwed/, and /kareg/ are given below. The mutation being applied is in
parentheses.



(5a) (NASAL-Standard form)        [əŋ ŋha:nol ə ɬaur]
(5b) (SOFT-Dialectal form)        [əŋ ge:nol ə ɬaur]
                                  In the middle of the floor.

(6a) (ASPIRATE-Standard form)     [(ni) xləwes i ðim]
(6b) (SOFT-Dialectal form)        [gləwes i ðim]
                                  I didn't hear.

(7a) (ASPIRATE-Standard form)     [buru a xareg]
(7b) (CITATION-Dialectal form)    [buru a kareg]
                                  To hit with a stone.



Awbery's claim is that although these changes show considerable variation
among dialects and speakers, there is a clear pattern of change in progress
from a four-way to a two-way system.

Given that such a change is occurring, the everyday experience of
mutation for speakers in the South must be somewhat varied; that is, speakers
must be accustomed to hearing both Standard and dialectal forms in the
relevant mutation contexts. From the standpoint of any one speaker's
experience, and regardless of whether the speaker's own grammar and productions
are based on Standard or dialectal forms, the recognition system must
anticipate alternative possibilities for those contexts of NASAL and ASPIRATE
mutations in which substitutions may occur. In addition, overall, speakers
must hear fewer instances of the NASAL and ASPIRATE mutations than of the SOFT
mutation and CITATION forms. Presumably speakers are aware of this situation
at some level of their internal grammar; that is, they must 'know' that the
NASAL and ASPIRATE mutation contexts are problematic.

It is often hypothesized that language change coalesces around some point
of vulnerability in the system (opacity, hole in the pattern, etc.). In this
vein, it's interesting to note that even in Standard Welsh, there are many
more contexts that require the SOFT mutation than the ASPIRATE and NASAL
mutations; an informal count of triggering contexts listed in Jones (1977)
reveals 2 for NASAL and 10 for ASPIRATE, as opposed to 51 for SOFT. This
imbalance has apparently (Warren Cowgill, personal communication) always existed
in the history of Welsh; although all three mutations have been steadily
losing contexts, the NASAL and ASPIRATE mutations have always been relatively
the most impoverished. (Note, however, that both NASAL and ASPIRATE mutations
occur in some very common phrases, for instance, the NASAL mutation after yn
'in' and the ASPIRATE mutation after ei 'her.' This may mean that if one
factor affecting vulnerability to linguistic change is frequency of usage, the
particular measure applied must be based on regularity of usage, or number of
forms subject to the rule, rather than simple text frequency.) Thus,
historical data as well as data from current productions in spoken Welsh suggest
that the ASPIRATE and NASAL mutations are weakening.

The results we present in this paper are focused on the state of the
mutation system as a result of the change in progress in spoken Welsh. However,
these data are derived from a series of experiments originally designed to
speak to a different issue, that of the internal structure of the lexicon for
morphologically related words. Because of this separation between the
original aim of the experiment and the way we look at the data here, only a very
brief description of the experimental design and the theory behind it is
offered below. The entire series of experiments is reported in detail in Boyce,
Browman, and Goldstein (in preparation).

Briefly, the experiments involved a method known as repetition priming.
This technique relies on the fact that a subject who has heard or read a word
recently will recognize it faster and more accurately when it is presented a
second time; that is, the subject is "primed" for recognition of that word.
The effect has recently been manipulated to probe the organization of the
lexicon for morphologically related words by testing which pairs of related
words produce a priming effect (Stanners, Neiser, Hernon, & Hall, 1979).
Thus, our experiments were structured to measure priming between various forms
(CITATION, SOFT mutation, etc.) of the same lexical item.

Procedure 

Subjects listened first to a list of "priming" words and then to a second
list of "target" words that were obscured by simultaneous random noise.
Words used were mono- and bisyllabic masculine nouns beginning with a voiced
or voiceless oral stop, and were carefully balanced for number of (citation
form) initial /p/, /t/, /k/, /b/, /d/, /g/. All had stress on the first
syllable.

Each word was presented in a syntactic context that required a particular
mutation and was recorded onto tape by a native speaker of North Welsh. The
contexts were as illustrated in examples (1)-(3) above with the addition of
optional postpositions: (a) ei ___ o (SOFT MUTATION); (b) ei ___ hi (ASPIRATE
MUTATION); and (c) ə ___ əma (CITATION FORM for masculine nouns). The
postpositions mean, in order, 'of him,' 'of her,' and 'here' or 'this.' (All
three phrases are in current colloquial usage.) Subjects were told which
phrases would occur and were asked to write each phrase in full if they could.
Only full phrases with correct context and mutated form as well as correct
lexical item were scored as correct responses. Note that although in general
the ASPIRATE mutation is subject to dialectal substitution, the ASPIRATE
mutation in the context "ei (hi)" is rigorously observed (Jones, 1977, p. 105).
Presumably this is due to contrast with the SOFT mutation context "ei
(o)." To simplify experimental design, the NASAL mutation was not used.

Subjects 

The subjects were 60 native speakers of Welsh recruited through
Welsh-speaking clubs at the University of Bangor, Wales, and Cambridge
University, England. Of these, 34 were born and educated in North Wales, and 26
were born in South Wales. (The major dialect boundary for Welsh runs between
South and North Wales.) Forty-eight of the subjects had experience with
Northern dialect from living in North Wales; the other 12 (all born in the
South) were accustomed to Northern dialect from radio programs and friends.
None had any difficulty understanding the Northern pronunciation of the
speaker who made the tape.

Results 

We present data here from two experiments. As noted above, both
experiments were set up to contrast different prime-target combinations for the
three contexts used; however, for our purposes here the relevant comparison
is always, for instance, all CITATION form means versus all ASPIRATE mutation
means, all ASPIRATE mutation means versus all SOFT mutation means, and so on.

In Experiment 1, each form of a lexical item (CITATION, SOFT mutation, or
ASPIRATE mutation) was primed by that lexical item in the same form. This was
contrasted with conditions in which the target word was not primed. In all,
eight different lexical items (words) were used. Each was represented once in
the appropriate form (CITATION, SOFT mutation, ASPIRATE mutation) in each
condition. The following table shows the results as mean percent correct
responses to the target form.





                           PRIME:
TARGET                     SELF           NONE

CITATION FORM               73             46
SOFT MUTATION               67         [illegible]
ASPIRATE MUTATION           36         [illegible]


Here we see that means for the CITATION form and SOFT mutation are nearly
identical in both the SELF and NONE conditions. This means that the CITATION
and SOFT mutation forms behave similarly under both presentation conditions.
In contrast, the means for the ASPIRATE mutation are considerably lower.
(Analysis of variance showed the difference between the three sets of means to
be significant at the 1% level. A posteriori contrasts between pairs of means
for the CITATION, SOFT mutation, and ASPIRATE mutation indicate this
difference is due to the lower ASPIRATE mutation means. There was no significant
difference between means for the CITATION form and SOFT mutation.) This
suggests that, even when subjects had been previously exposed to the same word,
in the same mutating phrase, words to which the ASPIRATE mutation had applied
were more often misperceived. This difference between SOFT and ASPIRATE
mutations held for speakers born in both the South and North regions. (The
interaction between mutations and area of speaker was not significant.)
Rescoring in which all phrases with the correct lexical item were counted
(regardless of mistakes in context or mutation heard) did not alter these
results. Thus, the weakness of the ASPIRATE mutation does not represent some
"bias" on the part of subjects against reporting the ASPIRATE mutation, or
against reporting the "ei hi" 'hers' context. Rather, actual recognition
of the lexical item is impaired in this context.

The second experiment is essentially a replication of the first, for a
larger word set and with the addition of two more prime-target conditions.
The CITATION form was excluded. Thus, forms in both mutations were presented
in the following conditions: (i) primed by themselves (SELF priming); (ii)
not primed; (iii) primed by the citation form (BASE priming); and (iv)
primed by the other mutation (OTHER priming). The following table shows the
data for the SOFT and ASPIRATE mutation under each of these conditions, again
as mean percent correct recognition of the target mutated form. This time, 32
lexical items were used. Again, each appeared once in each condition.



                       PRIME:
TARGET                 SELF          NONE          BASE          OTHER

SOFT MUTATION           61            41        [illegible]   [illegible]
ASPIRATE MUTATION   [illegible]       27            43            39



As in Experiment 1, in all conditions those forms to which the ASPIRATE
mutation has applied are poorly recognized compared to forms in which the SOFT
mutation has applied. (Analysis of variance showed the difference between the
two sets of means to be significant at the 3% level.) Again, this pattern
holds for speakers from both regions (the interaction of area by mutation was
not significant), and rescoring again made no difference.

Discussion 

Taken together, these results parallel the change in the mutation system
documented by Awbery. Her evidence shows the weakness of the ASPIRATE
mutation, as a rule that is being replaced by another rule, and suggests that the
CITATION form and the SOFT mutation contrast with the ASPIRATE as lively,
well-established rules in the grammar of Welsh. The experiments described
above show that this linguistic situation is reflected in (1) an equal
probability that the CITATION and SOFT mutation forms will be correctly
identified and (2) a greater likelihood that forms in the ASPIRATE mutation will
be misperceived or missed. This result is particularly striking because, as noted
above, the context "ei hi" is an extremely robust environment for the
ASPIRATE mutation. Thus, the differential effect for the ASPIRATE mutation
occurs in a context exempt from the change in progress. This shows that it is
the rule itself, with all the contexts in which it applies, that is
problematic rather than one particular syntactic or morphological context. Further,
speakers from both dialect areas show this effect of decreased perceptibility
for the ASPIRATE mutation.

These results are interesting for several reasons. First, of course, our
experiment constitutes independent and empirical support for Awbery's
hypothesis about mutation rule change in Welsh. More importantly, our experiment
shows that rule change in progress may be reflected in a tendency to confuse
or misperceive input that is eligible to undergo the changing rule. We have
seen that for all speakers, regardless of dialect area, any ASPIRATE mutation
context is susceptible to misperception. This is clear because the
misperception occurs even in the robust context "ei hi" 'hers,' which is not
subject to dialectal substitution or change. It is not clear how much this
decrease in perceptibility for the ASPIRATE mutation is due to the subjects'
experience of dialectal substitution in ASPIRATE mutation contexts, and how
much to internal, grammar-related factors that may have led to changing
production in the first place. We know that (many) Southern speakers are
accustomed to experiencing an unstable situation for the ASPIRATE mutation, but
data on how much the experience of Northern speakers includes substitutions in
ASPIRATE mutation contexts are currently unavailable. Production data from the
North parallel to Awbery's are needed to sort out these possibilities. It is
possible that a study of Northern dialects would reveal a similar pattern of
change in progress. If so, then the interpretation of our data is the same for
both Northern and Southern speakers, i.e., that the recognition system changes
as production changes; in some cases, as in our robust 'hers' context, it may
even anticipate production for environments that are eligible to undergo the
changing rule, but don't. On the other hand, if no such changes are reported
in Northern dialects, our experiments may have tapped the early stages of a
change that has not yet emerged into production in the North.

Footnotes

1

More precisely, it voices [p], [t], [k], [ɬ] and [r̥], and spirantizes
[b], [d] and [m]. The fricative reflex of [g] was once realized as [ɣ] but has
since disappeared.

2

It has been shown (Kempley & Morton, 1982) that target words that have
been primed are more readily recognized in noise.

3

An alternative explanation for these data based on differences in
discriminability for fricatives and stops may occur to the reader. Notice,
however, that equal numbers of both phonetic categories appear in both
mutations (e.g., [bot] vs. [veik], [fot] vs. [beik]), and that while words with
initial voiced fricatives were somewhat better recognized than words with
initial voiceless fricatives, words with initial voiceless stops were better
recognized than words with initial voiced stops: thus, the effects should even
out. In addition, a parallel experiment (not reported here) using words whose
initial consonants are never subject to mutation showed the same differential
effect in ASPIRATE mutation contexts. This evidence is examined in greater
detail in Boyce et al. (forthcoming).

SINGLE FORMANT CONTRAST IN VOWEL IDENTIFICATION*



Robert G. Crowder+ and Bruno H. Repp



Abstract. Subjects rated ambiguous steady-state vowels from a
continuum with respect to the categories /i/ and /ɪ/ (Experiment 1) or
/ɛ/ and /æ/ (Experiment 2). Each target was preceded, .35 sec
earlier, by one of the following precursors: (1) one endpoint from the
target continuum, (2) the other endpoint, (3) the isolated first
formant (F1) from (1), (4) the isolated F1 from (2), or (5) a hissing
noise. Although (3) and (4) did not sound as if they came from
the target continuum, they produced reliable contrast in both
experiments. In the /i-ɪ/ experiment, single-formant contrast was
as powerful as from the full vowels. These results suggest a sensory,
rather than judgmental, basis for the vowel contrast effects
obtained.

The occurrence of contrast in perceptual judgments along single
dimensions is so commonplace it seems almost uninteresting. In judging shades
of grey, heaviness of lifted objects, line lengths, loudness of tones, and so
on, the perceived magnitude of one stimulus is usually affected contrastively
by another stimulus with which it is presented. A patch of grey seems dark
against a white background and yet the same patch seems light against a black
background, for example. The pervasiveness of contrastive interactions
between nearby stimuli should not, however, lead us to forget how important it
is. Contrast allows the perceptual system to focus on what otherwise might be
elusive differences. It hardly requires discussion that edge sharpening, in
vision, advances the more informative aspects of the visual world at the
expense of the less informative aspects.

The example of visual brightness contrast is an interesting one because a
detailed neurophysiological basis for it has been worked out (however, see
Gilchrist, 1977). Edge sharpening in at least simultaneous brightness
contrast follows inescapably from verified rules of recurrent lateral inhibition
in the visual (retinal) system (see summary in Lindsay & Norman, 1977). Where
does this leave us with other kinds of contrast, though? It would be
grandiose to apply the neural circuitry proposed for brightness contrast to, say,



*Also Perception & Psychophysics, in press.
+Also Yale University.

contrast effects in judging the conservatism of Supreme Court Justices. Some
kinds of contrast, in other words, might be more "cognitive" or judgmental and
others more sensory. (This comment need not seem like an eruption of the
mind/body distinction: Infants would display the sensory forms of contrast
but not the judgmental forms.)

In fluent speech perception, most especially in unstressed, "reduced"
vowels, the stimulus information is often impoverished relative to
prototypical category instances. In a speaker's haste to get from one to another
consonantal gesture, he/she often "misses" producing a vowel sound in anything
like its citation form. Veridical perception would be well served, in these
cases, by a process of "edge sharpening" for the vowel perception system, so
that surviving acoustic stimulus distinctions would be exaggerated. Indeed,
since the work of Fry, Abramson, Eimas, and Liberman (1962), it has been known
that isolated vowels show contrastive context effects in identification
judgments. Sawusch, Nusbaum, and Schwab (1980) have distinguished three classes
of explanation for the various instances of vowel contrast that have been
reported in the intervening years. They discount feature detector fatigue as
an explanation because there are often substantial time lapses between context
and target. We can also mention that retroactive contrast (Diehl, Elman, &
McCusker, 1978; Repp, Healy, & Crowder, 1979), where the target comes first
and the context second, effectively dismisses this first explanation. Changes
in auditory ground and response bias are the two remaining classes of
hypothesis. Sawusch et al. (1980) apply these interpretations mainly to contrast
elicited in an anchoring paradigm and we need not follow these applications
here in detail. It suffices to remark that the auditory-ground interpretation
appeals to sensory contrast in a way that is congenial with the analogy to
visual brightness contrast, whereas response bias would very clearly be a
judgmental process.

Some recent research on selective adaptation in the perception of stop
consonants has pointed towards auditory-sensory explanations rather than
towards judgmental explanations. In the selective adaptation paradigm, repeated
presentations of an adaptor stimulus are shown to affect the perception of a
subsequent test stimulus. The experimental operations involved in measuring
selective adaptation are obviously a special case of contrast, several authors
having suggested a profound continuity of process between the two (Crowder,
1981; Diehl et al., 1978; Diehl, Lang, & Parker, 1980). In two such
experiments, the authors were able to pit sensory (spectral) and judgmental
factors against each other. In one of these studies, Roberts and Summerfield
(1981) used an audio-visual adaptor in which an acoustic /be/ was synchronized
with a visual /ge/. The combination was identified as /de/ or /ðe/; however,
its effect on perception of a /be-de/ test series was identical to that of an
unambiguous acoustic /be/. Thus, perception responded to the spectral, and
not the perceived phonetic, nature of the adaptor. In the other experiment
(Sawusch & Jusczyk, 1981), an adaptor was made from a fricative-stop-vowel
syllable /s^ba/ with 75 ms of silence between the two segments. Under these
conditions, subjects call the adaptor syllable "spa" even though the
stop-vowel portion, alone, is unambiguously /ba/. In a /ba-pa/ test series
following these and other adaptors, the "perceptual /spa/" but "acoustic
/s^ba/" affected responses just the same way as did an unambiguous /ba/, again
showing the spectral cue to prevail even in the face of contradictory
labeling.

In their experimental work with isolated vowels from the /i/-/ɪ/
continuum, Sawusch et al. (1980) reached the interesting conclusion that both
sensory and judgmental factors contribute to contrast, but at different ends of
the continuum. Their technique involved measurement of discrimination
sensitivity as well as identification ratings. When /i/ was the context (anchor),
there seemed to be genuine changes in the sharpness of the sensory system in
the /i/ range of the continuum. When /ɪ/ was the context, however, the changes
seemed rather to be in where people placed their response criterion. Thus,
within the same stimulus continuum we may be able to see more than one form of
contrast operating.

Experiment 1 

The present report extends the Sawusch et al. work in several ways, but
it explores the same question of where contrast effects should be located with
regard to the sensory versus response end of the processing machinery. In our
experiment, pairs of isolated vowel sounds were presented in rapid (.35 sec)
succession. The stimuli all came from a seven-item /i/-/ɪ/ continuum varying
only in F1 frequency. (The stimuli used by Sawusch et al. varied in F2 as
well as F1.) The second item in each pair was the target (second through
sixth item from the continuum) and the first was either of the two endpoint
items, either of these two items with F2 and F3 removed, or a control (hiss).
No response was required to the first item in each pair, the context item.
These conditions were either mixed together randomly in a continuous series of
trials or were presented in blocks. In the latter arrangement we should be in
a position to observe the anchoring effects found by Sawusch et al. as well as
"regular" contrast between the two items in a pair. When the conditions are
randomized, however, only pairwise contrast should be observed. Our choice of
the neutral hiss in the control condition was considered: We wanted as close
to a "no contrast" condition as we could get. Any tone or vowel, however
unrelated to the test continuum it seemed, carried potential spectral or
phonetic bias. The hiss served as a simple warning signal with no such bias.
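The two arrangements of conditions described above can be sketched as trial lists. This is a minimal illustration, not the authors' procedure: the precursor labels, the function name `make_trials`, the `repetitions` and `seed` parameters, and the decision to shuffle block order are all assumptions introduced here.

```python
import random

# Hypothetical labels for the five precursor conditions described in the text.
PRECURSORS = ["endpoint_1", "endpoint_7", "F1_only_1", "F1_only_7", "hiss"]
# Targets are the second through sixth items of the seven-item continuum.
TARGETS = [2, 3, 4, 5, 6]


def make_trials(arrangement, repetitions=1, seed=0):
    """Return a list of (precursor, target) pairs for one session.

    "randomized" mixes all precursor conditions in one shuffled series;
    "blocked" keeps every trial with the same precursor together.
    """
    rng = random.Random(seed)
    if arrangement == "randomized":
        trials = [(p, t) for p in PRECURSORS for t in TARGETS] * repetitions
        rng.shuffle(trials)
        return trials
    if arrangement == "blocked":
        block_order = list(PRECURSORS)
        rng.shuffle(block_order)          # assumed: block order is randomized
        trials = []
        for precursor in block_order:
            block = [(precursor, t) for t in TARGETS] * repetitions
            rng.shuffle(block)            # targets vary randomly within a block
            trials.extend(block)
        return trials
    raise ValueError(f"unknown arrangement: {arrangement!r}")


if __name__ == "__main__":
    print(make_trials("randomized")[:5])
    print(make_trials("blocked")[:5])
```

In the blocked list a single precursor recurs on every trial of its block, which is the property that would let an anchoring effect build up; in the randomized list any precursor is equally likely on each trial.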

Because removal of F2 and F3 results In these Items* sounding u^llke tor 
kens of the /I/-/1/ continuum^ we can also offer expectations for which con- 
trast effects ought to be Influenced by whether the precursors are Intact 
three^foj mant vowels or not. If Sawusch et al. are correct In a.islgnlng con- 
trast produced by /I/ to sensory factors, we might expect that removal of F2 
and F3 would make little difference. For example^ Crowrler (1981» 1982) has 
proposed a theory of f requency^speclf Ic recurrent lateral Inhibition (see be^ 
low) that anticipates the same degree of contrast whether or not F2 and F3 are 
present. If contrast from the /i/ sxde of the continuum Is produced by other 
factors^ perhaps response bias or a range-frequency effect (Parduccl, 1974), 
then removal of F2 and F3 might alter the situation* 'ecause the tacit labels 
that , subjects might assign to the precursors (and that night engage the judg« 
menwl bias) would be foreign to the target con^'nuum. Furthermore* 

If a sort of adaptation "level mechanism contributed to con^.rast In the anchor* 
Ing situation CSawudch et al.» 1980), then we should expect more contrast In 
the blocked arrangement of conditions than in the randomized arrangement; 
this Is because In the blocked arrangement the same single precursor Is the 
first Item In each pair and therefore vastly outnumbers each of the six Items 
that can be the second Item In the pair. 
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Method 



Subjects. The subjects were 20 Yale undergraduates serving either for
pay or for course credit.

Stimuli. The basic continuum of seven vowels was prepared on the Haskins
Laboratories Parallel Resonance Synthesizer. The items were designed to range
perceptually from /i/ to /ɪ/ and varied only in F1 center frequencies (from
279 to 381 Hz in roughly equal steps). F2 and F3 center frequencies were kept
fixed at 2075 and 2780 Hz, respectively. These frequencies were compromise
values between those typical of the vowels /i/ and /ɪ/ (Peterson & Barney,
1952). Three additional stimuli were used: (1) the /i/ endpoint from the
continuum with F2 and F3 removed (through options within the synthesizer);
(2) the /ɪ/ endpoint modified in the same way; and (3) a soft hiss, which
served as a control. All stimuli were 300 ms long. The vowels rose in
fundamental frequency from 80 to 100 Hz during the first 100 ms and declined
to 85 Hz during the last ^00 ms. The amplitude envelope was likewise shaped at
the beginning and end of the syllable. In the stimuli with F2 and F3 removed,
the amplitude of F1 matched F1 amplitude in the corresponding full vowels.
However, the overall amplitude was reduced by removal of F2 and F3.

In a preliminary experiment, 25 subjects were given single-item
identification tests on vowels similar to the vowels lacking F2 and F3 used in
the present experiment. Other details of that preliminary experiment need not
concern us: It included contrast comparisons similar to, but superseded by,
those of the present experiment. Nothing in the preliminary study
compromises, however, what we found later. Of interest now is that these 25
subjects were asked to listen to tokens of the various precursors in isolation
and report what they sounded like, with examples of words containing the
sounds. It has long been known that people can perceive and classify
single-formant vowels, and that low-frequency single formants are heard as
back vowels (Delattre, Liberman, Cooper, & Gerstman, 1952). When describing
the /i/ endpoint of the vowels described above, but with F2 and F3 deleted, 24
of the 25 subjects reported it sounded like /u/ (BOOT) and the remaining
subject reported /^/. Given the /ɪ/ endpoint, 7 responded with the same vowel
(/u/), 14 with the sound /o/ (BOAT), 3 with /a/, and 1 with /V, but never /ɪ/.
Thus, we may be assured that removal of the second and third formants did
indeed drive labeling away from the /i/-/ɪ/ continuum.

The experimental tapes contained 100 trials (pairs of vowels) apiece.
Each trial included the precursor, a 350 ms delay, and then the target; after
the offset of the target, there was a 2.5 sec delay before the beginning of
the next trial. After every 10 trials, there was a longer intertrial delay (5
sec) intended to help subjects keep their places.



Procedure. Subjects in the blocked condition received five 100-trial
tapes, one for each of the five precursor conditions, in different orders as
determined by a Latin square. The 100 trials with a given precursor included
20 with each of the target vowels (numbers two through six on the original
continuum). Subjects in the random condition heard precisely the same 500
trials, also in batches of 100; however, the trials were completely
randomized, so that two adjacent trials usually had different precursors.
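
The counterbalancing just described can be sketched in a few lines of Python.
This is an illustration only, not the procedure actually used to build the
tapes: the cyclic construction of the Latin square, the condition labels, the
shuffling of trials within a blocked tape, and the fixed random seed are all
assumptions. The text specifies only that a Latin square determined tape order
and that each precursor was paired 20 times with each of the five targets.

```python
import random

# Hypothetical labels for the five precursor conditions of Experiment 1.
PRECURSORS = ["full-i", "full-ih", "f1-i", "f1-ih", "hiss"]
TARGETS = [2, 3, 4, 5, 6]   # positions on the seven-step continuum
REPS = 20                   # trials per precursor-target combination


def latin_square_orders(conditions):
    """Cyclic Latin square (an assumption): row r is the list rotated by r."""
    n = len(conditions)
    return [[conditions[(r + c) % n] for c in range(n)] for r in range(n)]


def blocked_tape(precursor, rng):
    """One 100-trial tape in which every trial has the same precursor."""
    trials = [(precursor, t) for t in TARGETS for _ in range(REPS)]
    rng.shuffle(trials)      # target order within a tape is assumed random
    return trials


def random_tapes(rng):
    """The same 500 trials, fully shuffled, cut into batches of 100."""
    trials = [(p, t) for p in PRECURSORS for t in TARGETS for _ in range(REPS)]
    rng.shuffle(trials)
    return [trials[i:i + 100] for i in range(0, len(trials), 100)]


rng = random.Random(0)       # fixed seed so the sketch is reproducible
orders = latin_square_orders(PRECURSORS)
tapes = random_tapes(rng)
```

Each row of `orders` would give the tape order for one subgroup of blocked
subjects; every precursor appears once in each row and once in each serial
position.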

In the first part of the session, subjects were played the /i/-/ɪ/
continuum three times, in order, from number two through number six of the
original seven. They were told that the first item in the group is "what we are
calling EE" and the last is "IH." They all then listened to the first 10
trials on the Random tape as practice, and then began the experiment proper.
Answer sheets had arabic numbers from 1 to 5 opposite each trial number. Over
the left column (the 1's) the word BEET was spelled and over the right column
(the 5's) BIT. The procedure was to rate the similarity of each target to the
vowels in these two prototypes by circling one of the five numerals.

Results 

The main results are shown in Figure 1, in terms of mean rating in the
"IH" direction. The left panel shows ratings when the precursors were the
full-vowel endpoints from the continuum; the right panel shows what happened
when F2 and F3 were removed from these stimuli. The Hiss condition is drawn
in both panels as a baseline.

Figure 1. The effect of five context precursors on the relationship between
the position of a target vowel along the /i/-/ɪ/ continuum and the
tendency to rate it as /ɪ/. The same control condition with a hiss
as context is plotted in both panels. On the left, data are shown
when the precursor was one or the other of the two series endpoints
(/i/ or /ɪ/). On the right are the results when the same precursors
were used with all but the first formant deleted. (Experiment 1)

First of all, there was no effect of blocking versus randomizing the
experimental conditions and so Figure 1 combines these two conditions. In a 2
(blocked/random) X 5 (precursor) X 5 (target position on the continuum)
analysis of variance, blocking did not approach the .10 alpha level, either as
a main effect or in interactions. However, the same analysis of variance
showed there were reliable differences among the five precursor conditions,
F(4,72) = 7.11, p < .01, differences that interacted with target position,
F(16,288) = 3.41, p < .01. As the figure shows, precursors had little or no
effect on the relatively unambiguous targets, numbers 2 and 6 from the
continuum. Of course, the position on the continuum of the target itself had a
reliable main effect on ratings of "IH-ness," F(4,72) = 120.06, p < .01.

Contrast was asymmetrical from the two ends of the vowel continuum, to
say the least: There was none when /ɪ/ or its single-formant version were
used as precursors. However, the /i/ precursor quite obviously made the
targets sound more like /ɪ/, and this was true with or without F2 and F3. The
next question is whether the overall degree of contrast was different for the
full vowels (left panel) versus the vowels with F2 and F3 removed (right
panel). The best single measure of the contrast effect is probably the
difference between the /i/ and /ɪ/ precursors (or their modified versions). In
a new analysis of variance, the control condition was dropped, and the factors
were (1) blocked/random, (2) full/altered precursors, (3) /i/ versus /ɪ/
precursors, and (4) position of the target on the continuum. There was a
reliable main effect of whether the precursors were altered or not,
F(1,18) = 5.25, p < .05, reflecting the fact that the full vowel precursors
(left panel of Figure 1) led to somewhat higher /ɪ/ ratings whatever the
identity of the precursor (/i/ or /ɪ/). However, the identity of the precursor
had a large main effect, F(1,18) = 13.91, p < .01, and it did not interact
with whether precursors were full or altered, F < 1. Thus, there was no
evidence that full vowels exert a greater contrast effect than vowels with F2
and F3 removed. There was a statistically significant interaction between
these two factors and the position of the target along the continuum,
F(4,72) = 3.23, p < .05. This interaction reflects the fact that the full
vowels had their effects exclusively on the fourth and fifth continuum
positions, whereas the contrast produced by altered vowels occurred at all
continuum positions except the last.
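
The contrast measure named above, the difference in mean rating between the
two precursor contexts, can be written out as a short Python sketch. The
function names and the ratings in the example are hypothetical and are not
data from the experiment; the sketch assumes only the 1-to-5 scale described
in the Procedure, on which higher numbers mean "more like IH."

```python
def mean(values):
    """Arithmetic mean of a non-empty list of ratings."""
    return sum(values) / len(values)


def contrast_by_position(after_i, after_ih):
    """Mean rating after the /i/ precursor minus mean rating after the
    /ih/ precursor, computed separately for each target position.

    Both arguments map a continuum position to a list of 1-5 ratings.
    A positive value means targets sounded more IH-like after /i/,
    which is the contrastive direction.
    """
    return {pos: mean(after_i[pos]) - mean(after_ih[pos]) for pos in after_i}


# Hypothetical ratings for one ambiguous target (position 4), illustration only.
example_after_i = {4: [4, 5, 4]}
example_after_ih = {4: [3, 3, 3]}
example_contrast = contrast_by_position(example_after_i, example_after_ih)
```

Computing the difference separately for each position keeps visible the
interaction with target position that the analysis of variance reports.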



Discussion

The main finding of the experiment is that contrast was obtained even
after F2 and F3 were removed from the endpoint vowels, rendering them
phonetically foreign to the continuum being judged. The degree of contrast
was not even changed by this operation. It is true that the contrast effect
"bulged" differently with the full vowels than with the altered vowels, as
revealed by the significant three-way interaction identified in the previous
paragraph. We defer comment on this finding until after reporting the second
experiment.

Another finding was the asymmetry in contrast across the /i/-/ɪ/
continuum. Whereas Sawusch et al. (1980) observed anchor effects from both
ends of this continuum, and later were able to assign them to different
mechanisms, we simply got no contrast at all from /ɪ/. At the very least, the
asymmetry of this continuum in contrast tells us there is more at work here
than a single judgmental bias leading people to assign contrasting labels to
precursor and target. Such a bias would result in symmetrical effects. In
fact, if we accept the Sawusch et al. analysis, our results suggest that
judgmental bias (associated with the /ɪ/ context) simply did not occur in our
experiment.

It made no difference whether conditions were blocked or mixed randomly
across trials. This is comforting in that it means there is one less choice
to worry about in designing experiments. It was disappointing in the context
of this experiment, however, for if contrast from /ɪ/ precursors were a
consequence of judgmental bias, one might have expected blocking to make a
difference, and differentially for the full and the altered vowels. One
possibility is that judgmental bias was not engaged in the pairwise,
precursor-target trial arrangement because the precursor never required an
overt labeling response. In the anchoring literature, all items are
identified in sequence. Perhaps requiring people to label both of the two
items in each pair (as in Repp et al., 1979) would have been sufficient to
make blocking a more interesting variable. In addition, the short interval
between the stimuli in a pair (350 ms) may have discouraged subjects from
assigning covert labels to the precursors.

Single-formant contrast is predicted by Crowder's (1978, 1981, 1982,
1983) theory. The hypothesis is that vowels are represented in auditory
memory in some form similar to a sound spectrogram. When two tokens are
together in auditory memory, they show frequency-specific lateral inhibition;
that is, where the two share formant energy, they mutually weaken each other's
representation. This process is shown schematically in Figure 2 for an
/a/-/*/ pair.

The inhibition has no effect on vowel quality when formants match from
the two vowels. However, when formants partially overlap, as in the
illustration of Figure 2 or in F1 of nearby members of the /i/-/ɪ/ continuum,
the intersection region will be inhibited in both. This leaves the most
extreme regions of the intersecting formants intact, giving them more extreme
formant center frequencies after inhibition than they had before. The absence
of contrast from /ɪ/ tokens is, of course, as baffling to this model as it is
to most others. At this point, we decided to replicate our experiment with
another vowel continuum, in order to see whether these findings had any
generality.
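
The mechanism described above can be made concrete with a toy computation. The
Python sketch below is not Crowder's model; it is a minimal illustration under
stated assumptions: each formant is a Gaussian bump on a frequency axis, the
bandwidth (60 Hz) and inhibition weight `k` (0.5) are arbitrary, inhibition is
a pointwise subtraction clipped at zero, and the perceived formant center is
taken to be the centroid of what remains.

```python
import math


def formant(freqs, center, bandwidth=60.0):
    """Gaussian bump standing in for one formant (shape is an assumption)."""
    return [math.exp(-0.5 * ((f - center) / bandwidth) ** 2) for f in freqs]


def inhibit(a, b, k=0.5):
    """Mutual frequency-specific inhibition: each representation is weakened,
    channel by channel, in proportion to the other's energy in that channel."""
    a_new = [max(x - k * y, 0.0) for x, y in zip(a, b)]
    b_new = [max(y - k * x, 0.0) for x, y in zip(a, b)]
    return a_new, b_new


def centroid(freqs, spectrum):
    """Energy-weighted mean frequency, used here as the formant center."""
    return sum(f * s for f, s in zip(freqs, spectrum)) / sum(spectrum)


freqs = list(range(100, 601))        # 1-Hz channels from 100 to 600 Hz
precursor = formant(freqs, 279.0)    # F1 of the /i/ endpoint
target = formant(freqs, 330.0)       # F1 of a mid-continuum target

p_after, t_after = inhibit(precursor, target)
target_shift = centroid(freqs, t_after) - centroid(freqs, target)
precursor_shift = centroid(freqs, p_after) - centroid(freqs, precursor)

# Matching formants: inhibition scales the bump but leaves its center alone.
same_after, _ = inhibit(target, target)
same_shift = centroid(freqs, same_after) - centroid(freqs, target)
```

In this sketch the target's F1 centroid moves up, away from the precursor, and
the precursor's moves down, while identical formants are weakened without any
shift. That is the qualitative pattern the text describes; the sizes of the
shifts depend entirely on the assumed bandwidth and weight.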

Experiment 2

The second experiment used vowels from the /ɛ/-/æ/ continuum and dropped
the comparison of randomized and blocked conditions, using blocked
presentation only. Otherwise, it was nearly identical to Experiment 1. The
purpose was simply to generalize to different subjects and different vowels
the occurrence of single-formant contrast.

Method 

Subjects. The subjects were 30 Yale undergraduates, participating in
return for pay.

Stimuli. Another seven-vowel continuum was prepared on the Haskins
Laboratories Software Serial Synthesizer. (No parallel synthesizer was
available to us at the time.) The vowels were designed to range perceptually
from /ɛ/ to /æ/ and varied, this time, both in F1 and F2 (respectively, from
530 to 660 Hz and from 1840 to 1720 Hz); F3 was fixed at 2480 Hz. The response
alternatives on the extremes of the five-point rating scale were the words BET
and BAT. Low-pass filtering (cutoff frequency = 800 Hz, rolloff = 48
dB/octave) was used in order to produce endpoint tokens of /ɛ/ and /æ/ with F2
and F3 deleted. These altered versions of /ɛ/ and /æ/ sounded to us
unambiguously like / / and /V, respectively. The same hiss was used as in
Experiment 1. The items were all 260 ms long, the ISI was .35 sec, and the
delay between trials was set at 3.5 sec. In all other procedural details, this
experiment was identical to the blocked condition of the previous one.

Results 

The results are shown in Figure 3, which is organized identically to
Figure 1. Evident in the figure are several findings: (1) Contrast from the
/ɛ/ direction on the vowel continuum occurred both for the full vowel
precursors and for the single formants; (2) this contrast was larger for the
full vowels, however, than for the single formants; (3) there was no trace of
contrast from the /æ/ direction; and (4) there was again a tendency for
contrast to "bulge" in the most phonetically ambiguous region of the target
items when the full vowels were precursors, but not when the single formants
were used.

Figure 2. An illustration of how frequency-specific lateral inhibition could
produce phonetic contrast in vowels. See text for explanation.
(Panel title: Prediction of phonetic contrast in vowels from
frequency-specific lateral inhibition in auditory memory.)

Figure 3. The same as Figure 1 except the data are for an /ɛ/-/æ/ continuum
(Experiment 2).

These observations were confirmed by several analyses of variance on mean
ratings (toward /æ/). In the first of these, all five precursor conditions
were crossed with the five target vowels. Both main effects and the
interaction were statistically significant; for conditions, F(4,116) = 13.32,
p < .01; for target vowels, F(4,116) = 43.46, p < .01; and for the
interaction, F(16,464) = 4.49, p < .01. In the next analysis, the hiss
condition was dropped, leaving a two-by-two design with respect to the
precursor (full vowels versus F1 only X direction of contrast, from /ɛ/ versus
from /æ/). The five target vowels were compared in the third factor. There
were statistically significant main effects of whether the full or altered
vowels were used as precursors, F(1,29) = 6.33, p < .01; of the direction of
contrast, F(1,29) = 47.12, p < .01; and of the placement on the continuum of
the target, F(4,116) = 43.41. Given the uniform results of precursors from the
/æ/ direction, the first of these main effects means the full vowels produced
more contrast than the first formants alone. The reliable interaction between
direction of contrast and the five target vowels, F(4,116) = 7.81, p < .01,
indicates that with the full and altered vowels combined, the interior
(ambiguous) vowels were more affected than those closer to the endpoints. The
three-way interaction between precursor type (full versus altered), direction
of contrast, and vowel was statistically significant here, as in the previous
experiment, F(4,116) = 3.96, p < .01. This interaction is the most direct
verification of the "bulging" in contrast effects for the full vowels but not
for the altered ones.

The main empirical goal of this article is the establishment of
single-formant contrast. Therefore, since single-formant contrast was smaller
than full vowel contrast in this experiment, one additional analysis of
variance was performed, including only two precursor conditions, the hiss
control and the single-formant alteration of /ɛ/. In this analysis, the main
effect associated with this comparison was reliable at the .01 level of
confidence, F(1,29) = 6.66. The position of the target vowel was of course
also a reliable source of variation, F(4,116) = 41.29, p < .01, but the F for
the interaction was less than 1.00. Thus, both experiments require the
conclusion that single-formant contrast can occur on a vowel continuum even
when these precursors do not resemble phonetically the vowels targeted for
identification.

General Discussion 

The two studies are so very consistent in most respects, we should deal
with the single major discrepancy first: In Experiment 1, the total amount of
contrast was no larger for the full vowels than for the single-formant
precursors, but in Experiment 2, contrast was larger for the full vowels. One
difference in the target stimuli used in the experiments may be critical here.
In Experiment 2, there was variation between /ɛ/ and /æ/ in both of the first
two formants, whereas in Experiment 1, only the first formant varied along the
continuum. Methodologically, this discrepancy appears at first inexcusable
but in fact was desirable to get "prototype" exemplars of both phonetic
categories (see Peterson & Barney, 1952). The consequence is that in
Experiment 2, an ambiguous item on the /ɛ/-/æ/ continuum was receiving
potential contrastive influences from both F1 and F2 in the case of the full
precursors, but only from F1 in Experiment 1. It will be straightforward to
untangle these factors in future research should anyone ever be interested in
whether single-formant contrast is numerically equal to regular contrast along
continua varying in only one formant.

In both experiments, the full vowels produced markedly more contrast for
the more ambiguous tokens than for those from near the endpoints (the
"bulge"), whereas the altered, single-formant vowels produced relatively
uniform contrast across the entire target continuum. We are intrigued by this
result, but can offer little guidance in its interpretation. An obvious
possibility for the locus of an interpretation is in some phonetic process.
The identification targets that receive the most contrastive influence from
the full vowels are those that are most ambiguous phonetically.
Correspondingly, what distinguishes the full vowels from their single-formant
variants is that they carry clear phonetic information about the relevant
continuum. Somehow the richness of the phonetic information in the precursor
could combine judgmentally with the precarious initial classification of the
ambiguous target. But such a simple appeal to judgmental bias will obviously
not work, because there is no corresponding selective effect of the putative
phonetic process when contrast is measured from the "wrong" direction (that
is, from either /ɪ/ or from /æ/).

No contrast was obtained in either study when one continuum endpoint was
used and abundant contrast was obtained when the other was used. What is the
principle responsible for these asymmetries? With data on only two continua,
we would be foolish to propose a general hypothesis. We are now following
several theoretical possibilities experimentally. The main burden of this
paper is not the asymmetry in contrast observed here and elsewhere, however.
In both experiments, we found unambiguous evidence that single-formant
precursors affect ambiguous vowel identification. This much was predicted by
Crowder's (1978, 1982, 1983) theory. There may well be other theories that
predict single-formant vowel contrast and so we shall not stress the
confirmation of these results for that one particular prediction. More to the
point, the sort of theory that assigns contrast in this situation to specific,
sensory processes is advanced by single-formant contrast: If subjects were
trying to "balance out" their use of the response categories between their
internal naming of the precursor and their explicit rating of the target, the
single-formant precursors should have been much less effective.

One caveat needs to be added: Removal of F2 and F3 in Experiment 1 made
/i/ sound like /u/ and /ɪ/ sound like either /u/ or /o/. Now /i/ and /u/
share the feature of being high vowels, whereas /ɪ/ and /o/ are both
articulated with the tongue in a lower position. If subjects in Experiment 1
heard the altered vowels as an /u/-/o/ contrast (14 of the 25 subjects in the
preliminary identification test did) and if they were somehow sensitive to the
high-low feature, it is possible that they applied a judgmental bias with
respect to that feature. That is, if they heard what they thought was /u/ as
the first member of the pair, followed by an ambiguous token between /i/ and
/ɪ/, they might be biased to pick the lower tongue-position alternative, /ɪ/.
A similar argument can be applied to /ɛ/ and /æ/ in Experiment 2, which might
engage both the high-low and front-back dimensions. We cannot dismiss this
possibility on the basis of the present experiments. However, this sort of
process would encounter just as much difficulty with the asymmetry of contrast
along the various vowel continua as do other notions. Also, in Experiment 1,
where we have data on identification of the precursors, only about half of the
subjects would have been expected to identify the single-formant vowels as an
/u/-/o/ contrast, if we believe the 14/25 ratio of the preliminary experiment.
Furthermore, this alternative explanation requires considerable abstraction of
distinctive vowel features, which would be important if true, but remains
highly speculative now. And finally, the F1 dimension in vowel space is very
highly correlated with tongue height, so that the tongue-height dimension is
just another level of discourse in which one can talk about F1 frequency.

INTEGRATION OF MELODY AND TEXT IN MEMORY FOR SONGS*



Mary Louise Serafine, Robert G. Crowder, and Bruno K. Repp

Abstract. Two experiments examined whether the memory representation
for songs consists of independent or integrated components
(melody and text). Subjects heard a serial presentation of excerpts
from largely unfamiliar folk songs, followed by a recognition test.
The test required subjects to recognize songs, melodies, or texts
and consisted of five types of items: (a) exact songs heard in the
presentation; (b) new songs; (c) old tunes with new words; (d)
new tunes with old words; and (e) old tunes with old words of a
different song from the same presentation ("mismatch songs").
Experiment 1 supported the integration hypothesis: Subjects'
recognition of components was higher in exact songs (a) than in
songs with familiar but mismatched components (e). Melody
recognition, in particular, was near chance unless the original words
were present. Experiment 2 showed that this integration of melody and
text occurred also across different performance renditions of a song
and that it could not be eliminated by voluntary attention to the
melody.

Introduction 

Song is a universal artform that consists of two seemingly separate
components, melody and text. In practice a song may derive from a precomposed
melody to which words are added, or from a pre-existing text later set to
music. In fact a song may be the work of two artists, a composer and a poet or
librettist. Yet the relationship between melody and text raises interesting
questions in the domains of both aesthetics and cognitive psychology.

One of the aesthetic issues is how the artform should be defined:
whether it is simply a pairing of independent components or an integral whole
that transcends its parts. This issue has implications for the analysis of
songs from a music-theoretic viewpoint, for example, whether the components
can be considered or analyzed separately.

A parallel issue can be raised from a cognitive viewpoint: To what
degree are melody and text independent or integrated in perception and memory?
While there is substantial literature both on linguistic memory and on musical
memory (Deutsch, 1969; Dowling, 1973, 1978; Dowling & Fujitani, 1971),
research thus far does not indicate how a hybrid form such as songs might be
represented in memory. Indeed, research on hemispheric differentiation,
especially that which suggests left-hemisphere dominance for language and
right-hemisphere dominance for music (e.g., Best, Hoffman, & Glanville, 1982;
Kimura, 1967), leaves entirely open how melody and text in songs might be
processed.

Our interest in this issue was generated by informal observations
suggesting that, in memory for songs, melody and text form an integrated unit,
such that people find it difficult to separate the two components. For
example, if asked to recite the words of their national anthem, many people
would have to sing the song, or at least rehearse it subvocally, in order to
generate the words. Also, people may not immediately recognize that two
different songs have the same melody if their texts are different. "Twinkle,
Twinkle, Little Star" and "Baa, Baa, Black Sheep" are a case in point, where
identical melodies are part of what are considered entirely different songs.
Yeston (1975) provides the example of the well-known theme of the Mozart C
major piano sonata (K. 545), which (with slight changes in rhythm) is rarely
recognized as the melody of "Hey There, You With the Stars in Your Eyes."
Finally, the first author has found instances of profound melody/text
integration in informal experiments with a young child. In these experiments a
two-year-old, who could repeatedly and accurately perform a large body of
songs, was nevertheless incapable of singing the melodies on the syllable "la"
without the words. Instead, she simply spoke the syllable in rhythm.
Similarly, she was either unwilling or unable to repeat the words without the
melody.

These examples argue for some form of integration of melody and text in
memory for songs, although it is also true that adults, at least, can
voluntarily separate a melody from its text and vice versa in singing and
recognition. Thus in theory the memory representation for songs might consist
of: (1) independent components, (2) integrated components, or (3) a
non-decomposable whole (an extreme form of integration). If melody and text
were stored as independent components, we would expect that memory for songs
could be predicted by the independent probabilities of memory for melody and
memory for text. On the other hand, if the components were integrated, we
would expect that memory for one component facilitates memory for the other.
Finally, if songs were stored as non-decomposable wholes, we would expect that
melodies cannot be recognized as familiar when their words are different, and
vice versa. This last hypothesis is clearly false in many situations: Words
are easy to recognize in new contexts, and most people can probably recognize
a tune when the words are different if the tune is pointed out to them.
(Musicians and experienced listeners can often do so in any case.)
Nevertheless, it is worth investigating the degree to which a wholistic
representation may characterize novel songs, when attention is not explicitly
drawn to tune-similarity.
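
The independence hypothesis has a simple quantitative reading, sketched here
in Python. The function names and the probabilities in the example are
hypothetical, chosen only for illustration. The sketch assumes that "predicted
by the independent probabilities" means that the probability of remembering
both components equals the product of the two component probabilities.

```python
def independent_song_memory(p_melody, p_text):
    """Probability of remembering both components if they are stored
    independently: the product of the two component probabilities."""
    return p_melody * p_text


def facilitation(observed_joint, p_melody, p_text):
    """Observed joint memory minus the independence prediction.
    A clearly positive value would favor integration over independence."""
    return observed_joint - independent_song_memory(p_melody, p_text)


# Hypothetical values, for illustration only.
predicted = independent_song_memory(0.6, 0.5)
excess = facilitation(0.45, 0.6, 0.5)
```

With these invented numbers, independence predicts joint memory of .30, so an
observed value of .45 would exceed the prediction by .15. The wholistic
hypothesis corresponds to the limiting case in which neither component is
recognized at all without its original companion.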

Note that the issue of integration can be distinguished, at least
conceptually, from two related issues: compatibility and association.
Melodies and texts are often compatible rhythmically in that higher pitches,
longer durations, and musical-metric stresses tend to occur on accented
syllables. Similarly, melodies and texts may be compatible "semantically" in
that the tempo and musical mood seem to fit the meaning of the words. However,
it is possible that a cognitive form of integration occurs irrespective of the
compatibility of components. Indeed, whether compatibility is necessary or
sufficient for integration is an empirical question not under consideration in
the present experiments.

Also, integration of melody and text in song can be distinguished from
association as mere knowledge of co-occurrence. That melody and text co-occur
is undeniable. Yet association may occur without integration. Indeed, it is
possible to imagine other co-occurring events (e.g., speech and background
music) that do not give rise to integration.

The purpose of these experiments, then, was to investigate the degree to
which melody and text are independent, integrated, or nonseparable
("wholistic") in memory for songs.

Experiment 1 

Subjects heard 24 consecutive excerpts from folk songs, followed by a
20-item recognition test. The test items were of two types: (1) excerpts
that had been heard in the presentation ("old songs") and (2) excerpts that
had not been heard in the presentation ("new songs"). Further, new songs were
of four types: (a) new tune with new words; (b) old tune with new words;
(c) new tune with old words; and (d) old tune with old words that had been
sung to a different tune in the original presentation ("tune and words
mismatched"). In the remainder of the paper, the terms "tune" and "words," as
used with subjects, are interchangeable with "melody" and "text."

The main prediction was that, if subjects integrate melody and text in
memory, they should recognize previously heard melodies or texts more
accurately when they are paired with their original companion (text or melody)
than when they are paired with a different companion. On the other hand, if
melody and text are stored as independent components, then subjects should
recognize previously heard melodies or texts equally well, whether paired with
the same or with a different companion. Finally, if songs are stored as
wholistic units, then subjects should not be able to recognize melodies (or
texts) at all, except when they are paired with their original companion.

Method 

Materials. Songs that we considered unfamiliar to the average listener
were drawn from a collection of indigenous American folk songs compiled by
Erdei (1974). Twenty pairs of excerpts with interchangeable melodies and
texts were chosen, each excerpt consisting of the opening two to four measures
of a song. (See list in appendix.) Thus each pair of excerpts yielded four
different songs, a total of 80. Figure 1 shows a sample pair of
interchangeable melodies and texts; examples of the five test-item types are
shown in Figure 2.
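
The arithmetic behind the song bank (20 pairs, four songs per pair) can be
checked with a short Python sketch. The identifiers are hypothetical stand-ins
for the real excerpts. The labels follow the Aa/Ab/Ba/Bb notation of Figure 1,
with an upper-case letter for the melody and a lower-case letter for the text.

```python
from itertools import product

N_PAIRS = 20  # number of song pairs, as stated in the text


def songs_from_pair(pair_index):
    """Cross the two melodies of a pair with its two texts: Aa, Ab, Ba, Bb."""
    melodies = (f"A{pair_index}", f"B{pair_index}")
    texts = (f"a{pair_index}", f"b{pair_index}")
    return list(product(melodies, texts))


bank = [song for i in range(1, N_PAIRS + 1) for song in songs_from_pair(i)]

# Original songs pair a melody with its own text (Aa or Bb);
# the others (Ab, Ba) are derivatives with borrowed words.
originals = [(m, t) for m, t in bank if m[0].lower() == t[0]]
derivatives = [(m, t) for m, t in bank if m[0].lower() != t[0]]
```

Under these labels the bank holds 80 distinct excerpts, 40 originals and 40
derivatives.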

In some cases minor alterations were made to the original melody or text
to ensure a rhythmic fit with its companion. (See appendix.) For example,
"across" from one original text was changed to "'cross" in our experiments
(Figure 2, test item a). However, in all cases the texts and melodies were
identical across parallel presentation and test versions of a song.

[Figure 1: musical notation for Melody A and Melody B, each shown with both
texts: "When the train comes along, when the train comes along" and
"Hush-a-bye, don't you cry, go to sleep, little babe."]

Figure 1. Sample pair of songs with interchangeable texts. (Aa and Bb denote
original songs; Ab and Ba denote derivatives.)

[Figure 2: musical notation in two panels, "Sample presentation items" and
"Sample test items."]

Figure 2. Sample presentation and test items. (a: new tune, new words; b:
old tune, new words; c: new tune, old words; d: old tune, old
words, mismatched; e: old song)

The excerpts, sung by a tenor with vocal training, were recorded on tape.
They were sung as notated, except transposed down a fifth (or twelfth) to the
tenor range, and at a tempo of one beat per second (MM = 60). The excerpts
varied in key and mode, but were notated with G as the tonic in each case.
The tapes were recorded with a 5-sec interval of silence between presentation
items and a 10-sec response interval after each test item.

Design. From the bank of 80 song excerpts, five parallel sets of presentation
and test sequences were constructed. Each set was administered to a
different group of subjects.

In the presentation sequences (24 items), half the excerpts were tunes
with their original words (type Aa or Bb in Figure 1), and half were tunes
with words borrowed from their companion song (type Ab or Ba in Figure 1).

In the test sequences (20 items), each of the five test item types (a
through e in Figure 2) occurred four times. Moreover, across the five subject
groups presentation items were paired with each of the five possible test item
types, following a Latin square design. For example, Table 1 shows the
generation of possible presentation and test items from two of the song pairs.

                                  Table 1

             Presentation and Test Items From Sample Song Pairs

                                      SUBJECT GROUP

SONG PAIR          I             II            III           IV            V
               Pres./Test    Pres./Test    Pres./Test    Pres./Test    Pres./Test

[Aa and Bb] X  Aa / Ba (c)   Aa / Bb (a)   Aa / Ab (b)   Aa, Bb / Ab (d)  Aa / Aa (e)

[Aa and Bb] Y  Ab / Aa (b)   Ab / Bb (c)   Ab / Ab (e)   Ab / Ba (a)      Ab, Ba / Aa (d)

Test item types: a: new tune, new words; b: old tune, new words; c: new
tune, old words; d: old tune, old words, mismatched; e: old song.

Two examples (X, Y) from 20 song pairs.

As shown in Table 1, a mismatch test item (type d: old words paired with
old tune of a different song) required two presentation excerpts. Whenever
two such items were required in the presentation, they immediately followed
each other on the tape. Thus each presentation sequence required 4 pairs of
songs for the mismatch test items, plus 16 songs for the other test item types
(total of 24 songs).
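
The counterbalancing just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says a Latin square was used but does not give the full square, so the cyclic construction below (and the names `latin_square_assignment` and `presentation_length`) are hypothetical stand-ins, not the authors' actual assignment. The sketch checks the two counts given in the text: each test item type occurs four times per group, and each presentation sequence contains 24 excerpts.

```python
from collections import Counter

TEST_TYPES = ["a", "b", "c", "d", "e"]  # test item types from Figure 2
N_PAIRS = 20                            # song pairs listed in the appendix
N_GROUPS = 5                            # subject groups (one tape set each)


def latin_square_assignment(n_pairs=N_PAIRS, n_groups=N_GROUPS):
    """Return assignment[pair][group] -> test item type.

    Assumption: a cyclic Latin square is used here for illustration;
    the square actually used in the study is not given in the text.
    """
    k = len(TEST_TYPES)
    return [
        [TEST_TYPES[(pair + group) % k] for group in range(n_groups)]
        for pair in range(n_pairs)
    ]


def presentation_length(types_for_group):
    """Count presentation excerpts needed for one group's test items.

    A type-d (mismatch) test item draws its tune and its words from two
    different presentation excerpts; every other type needs one excerpt.
    """
    return sum(2 if t == "d" else 1 for t in types_for_group)


assignment = latin_square_assignment()
for group in range(N_GROUPS):
    column = [assignment[pair][group] for pair in range(N_PAIRS)]
    print(group + 1, dict(Counter(column)), presentation_length(column))
```

Any 5 x 5 Latin square repeated over the 20 pairs would satisfy the same two counts; the cyclic one is simply the easiest to write down.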

In all presentation and test sequences, the excerpts were generated
successively from Song Pairs 1 through 20, in the order listed in the appendix.
Thus, the interval between each presentation item and its corresponding
test item was roughly constant.

Procedure. Subjects were tested in small groups in a quiet room, except
for one large group that was tested in a classroom. Presentation and test
tapes were played back over loudspeakers. Subjects were instructed to listen
carefully to a presentation of 24 excerpts from simple folk songs and told
that their memory for them would be tested later. After the presentation they
were asked whether any of the excerpts were familiar, and if so, to estimate
their number on the answer sheet. Following that, the answer sheet was
explained and the test sequence was presented. About five minutes elapsed
between the presentation and test sequences.

For each test item, subjects were asked to indicate on the answer sheet
whether they had "heard that exact excerpt before," and if not, whether they
had heard either the tune or the words. In advance of the test, subjects were
given an explanation of the term "tune" (melody) and a description of the five
types of items they could expect on the test (types a through e). Thus, they
were prepared for the test of recognizing tune, words, or exact song, but they
did not have knowledge of this requirement prior to the presentation.

Subjects. Subjects were 32 undergraduate students with varying degrees
of musical training. The first 16 subjects, who were tested together in a
classroom environment, necessarily were all assigned to one particular
presentation/test condition. Sixteen additional subjects were divided among the
remaining four conditions.

Results and Discussion

Subjects' post-presentation estimates of the number of songs that seemed
familiar averaged 1.4 (out of 24 presentation items, 12 of which were original
folk songs). This result confirmed the relative unfamiliarity of the
materials.

For the discussion of recognition scores, we adopt this terminology: If
subjects indicated that they recognized an exact test excerpt as one that had
been heard in the presentation, this is called an "old song" response.
Similarly, if they indicated recognition of just the melody or the text, this is
called an "old tune" or "old words" response, respectively.

Recognition of old songs. Table 2 lists the mean proportion of "old
song" responses made to the five types of test items. Subjects correctly
recognized old songs 85% of the time, a surprisingly high recognition rate, given
that the presentation excerpts had been heard only once. Incorrect responses
were lowest (.07 and .06) whenever new words were heard, and highest (.39) for
"mismatched" tune and words, where both components had been heard originally
in different presentation songs.

                                  Table 2

              Mean Proportion of "Old Song" Responses (Exp. 1)

                                       Words
                                New     Old     Mean
New Songs     Tune     New      .07     .25     .16
                       Old      .06     .39     .22
                       Mean     .06     .32

Old Songs                       .85

                                  Table 3

      Mean Proportions of "Old Tune" and "Old Words" Responses (Exp. 1)

                            "Old tune" responses    "Old words" responses
                                    Words                   Words
                             New     Old     Mean    New     Old     Mean
New Songs     Tune   New     .44     .40     .42     .10     .78     .44
                     Old     .53     .63     .58     .13     .85     .49
                     Mean    .48     .52             .12     .82

Old Songs                    .92                     .92

It might be argued that this high false alarm rate for mismatched old
tune and words (.39) indicates some measure of independent storage for song
components. To some degree, subjects erroneously thought they recognized the
mismatch songs, apparently because the components were familiar, though never
paired in the original presentation. Note, however, that this effect was
largely due to a high false-alarm rate for all items containing old words,
which probably reflects the fact (discussed below) that words were much easier
to remember than tunes. Nevertheless, this false alarm rate is far below the
hit rate for original old songs, which indicates that subjects were more
likely to retain the association of a presented melody with its presented text
than to retain the components independently.
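
The argument in the preceding paragraph rests on simple arithmetic over the cells of Table 2, which can be sketched as follows. The cell values are those reported in Table 2; the helper name `marginal_mean` and the variable names are ours and purely illustrative. The sketch compares how much the "old song" false alarm rate rises when the words are old versus when the tune is old, and how far the mismatch false alarm rate falls below the hit rate.

```python
# "Old song" response proportions to NEW songs (Table 2, Experiment 1),
# keyed by (tune status, words status).
false_alarms = {
    ("new", "new"): 0.07,
    ("new", "old"): 0.25,
    ("old", "new"): 0.06,
    ("old", "old"): 0.39,  # mismatch songs: old tune with old words
}
HIT_RATE_OLD_SONGS = 0.85  # "old song" responses to genuinely old songs


def marginal_mean(cells, factor, level):
    """Average the cells whose tune (factor=0) or words (factor=1) is `level`."""
    values = [p for key, p in cells.items() if key[factor] == level]
    return sum(values) / len(values)


# Rise in false alarms attributable to old words vs. to an old tune.
words_effect = marginal_mean(false_alarms, 1, "old") - marginal_mean(false_alarms, 1, "new")
tune_effect = marginal_mean(false_alarms, 0, "old") - marginal_mean(false_alarms, 0, "new")

# Distance between the mismatch false alarm rate and the old-song hit rate.
gap_to_hits = HIT_RATE_OLD_SONGS - false_alarms[("old", "old")]

print(round(words_effect, 3), round(tune_effect, 3), round(gap_to_hits, 3))
```

The rise associated with old words is several times larger than the rise associated with an old tune, which is the sense in which the mismatch false alarms were "largely due" to old words.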

This is necessary but not sufficient evidence that subjects integrated
melody and text. To address the issue of integration, we must examine melody
and text recognition separately, and determine whether these components were
more accurately recognized in old songs than in any type of new song.

Recognition of components. Table 3 shows the mean proportion of
responses to the questions regarding recognition of old songs, tunes, and words.
In this table "old song" responses are included both in "old tune" and in "old
words" responses, for a response of "old song" indicates that subjects
recognized both the tune and the words.
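
The scoring rule described above can be written out as a short sketch. The representation of an answer-sheet entry as three booleans (exact excerpt, tune, words) and the function names are assumptions made for illustration, and the four responses at the bottom are hypothetical, not data from the study.

```python
def old_tune_and_words(exact, tune, words):
    """Convert one answer-sheet entry into (old_tune, old_words) flags.

    An "old song" response (exact=True) counts toward both the
    "old tune" and the "old words" tallies, as described for Table 3.
    """
    return (exact or tune, exact or words)


def proportions(responses):
    """Return (proportion "old tune", proportion "old words") over responses."""
    scored = [old_tune_and_words(*r) for r in responses]
    n = len(scored)
    return (sum(t for t, _ in scored) / n, sum(w for _, w in scored) / n)


# Hypothetical responses to one test item from four subjects:
# (exact excerpt heard?, tune heard?, words heard?)
hypothetical = [
    (True, False, False),   # "old song"
    (False, True, False),   # "old tune" only
    (False, False, True),   # "old words" only
    (False, False, False),  # nothing recognized
]
print(proportions(hypothetical))
```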

The main hypothesis was that, if melody and text are integrated in memory,
old tunes should be recognized more accurately in old songs than in
mismatch songs (or songs with new words). Similarly, old words should be
recognized more accurately in old songs than in mismatch songs (or songs with a new
tune).

Consider first the "old tune" responses. Old tunes were recognized more
accurately in old songs (.92) than in mismatch songs (.63) or songs with new
words (.53). The advantage for old songs over mismatch songs was highly
significant across subjects, t(32) = [illegible], p < .001, and across test items,
F(1,18) = 16.20, p < .001. The advantage was equally large for old songs
presented in their original folk song version and for old songs constructed by
recombining the melodies and texts of different original folk songs, F(1,18) =
0.05.

Consider now the "old words" responses. Words were recognized more
accurately in old songs (.92) than in mismatch songs (.85) or songs with new
tune (.78). Because of ceiling effects, the advantage for old songs over
mismatch songs fell just short of significance across subjects, t(32) = 2.00, p <
.06, but it was significant across items, F(1,18) = 6.35, p < .02. Once
again, it did not matter whether or not the old song was a real folk song,
F(1,18) = 0.18.

These results suggest that melody and text are integrated in memory to a
considerable degree. One component is recognized better in the context of the
other, original component, than in some new context. The advantage for
original contexts (old songs) holds even over new contexts in which the components
are just as familiar (mismatch songs). The deciding factor seems to be not
whether the components are familiar, but rather whether they had been paired
in the initial perception. Thus melody and text appear not to be stored
independently; the components are stored in some integrated fashion.

Responses to new songs. The data from Experiment 1 allow for a further
clarification of the integration effect. In the remaining discussion we
consider "old tune" and "old words" responses to new songs only. Two issues
are of interest here. First, were tunes and words recognized better than
chance in these new contexts? A strong form of the integration hypothesis
(that is, a "wholistic" conception) would predict that tunes and words
cannot be recognized at all outside of their original contexts (old songs).
Second, aside from the integration effect described above, there might also be
a "contamination" effect from companion components at the recall (not storage)
stage. That is, were tune and word judgments (whether correct or incorrect)
influenced by the familiarity of the other component?

To examine the above issues, separate 2 x 2 ANOVAs were performed on "old
tune" and "old words" responses, both across subjects and across items, with
the factors of tune (old vs. new) and words (old vs. new), whose combination
represents the four types of "new song" test items. With regard to tunes,
Table 3 shows a mean hit rate of .58 and a mean false alarm rate of .42, which
represents rather poor performance. The difference between hits and false
alarms was significant across subjects, F(1,30) = 9.61, p < [illegible], but not
across items, F(1,18) = 2.99, p < .1[illegible]. Thus, tune recognition in new songs
was near chance. The recognition score for words was much higher: a mean hit
rate of .82 versus a mean false alarm rate of only .12. This difference was
highly significant, of course.
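
One way to put the tune and word scores on a common scale is the signal detection index d', the difference between the z-transformed hit and false alarm rates. This is not an analysis reported in the text (the authors used ANOVAs on the raw proportions); it is offered only as a rough sketch, and because it is computed from group mean rates it will not equal the mean of per-subject d' values. The function name `d_prime` is ours.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance Gaussian d': z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Mean hit and false alarm rates for new songs, as reported above.
d_tunes = d_prime(0.58, 0.42)
d_words = d_prime(0.82, 0.12)

print(f"tunes: d' = {d_tunes:.2f}, words: d' = {d_words:.2f}")
```

On this index the tunes sit close to zero (chance) while the words are well above it, which matches the verbal summary in the paragraph above.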

Thus the strong form of the integration argument (a "wholistic"
conception) does not hold up to test here. Certainly, texts were recognized better
than chance in new contexts, and there seemed to be some minimal memory for
tunes as well, indicating some degree of independent storage of components.
As discussed earlier, components are more accurately recognized in original
contexts (old songs), but they may also be recognized to some degree in new
contexts.

The second issue concerns the influence of one component's familiarity on
judgments of the other component, a "contamination" effect. With respect to
tunes, Table 3 reveals that subjects responded "old tune" more frequently when
the words were old (mean of .52) than when the words were new (mean of .48).
This small effect was significant across subjects, F(1,30) = 5.91, p < .05,
but not across items, F(1,18) = 1.35. With respect to words, subjects
responded "old words" somewhat more frequently when the tune was old (mean of
.49) than when it was new (mean of .44). This effect was also significant
across subjects, F(1,30) = 5.52, p < .05, but not across items,
F(1,18) = .05.

In summary, Experiment 1 yielded the following results. The main finding
was that recognition of one component (melody or text) was facilitated by the
simultaneous presence of the other, original component (in old songs). This
effect argues for an integrated representation of melody and text in memory
for songs. In addition, we found that recognition memory for old songs was
excellent, even after a single presentation. Our casual observation was that
this excellent performance was accompanied by rather low confidence; many
subjects felt they were just guessing.

However, there is evidence that tunes, and especially words, can be
recognized to some degree when paired with new components. While this does not
contradict the integration hypothesis, it does indicate some measure of
separation of components and argues against the stronger "wholistic"
conception of melody/text relations.

Experiment 2

The integration effect in Experiment 1 leaves open at least three
questions. First, perhaps the effect was induced by the requirements of a song
(rather than melody or text) recognition task. The testing procedure in
Experiment 1 required primarily "old song" recognition, and only conditionally
tune and word recognition. Thus tune and word recognition scores were based
in large part on correct "old song" responses. It remains to be determined
whether subjects would recognize tunes or words more accurately in old than in
new contexts if they were asked to judge only these components. In other
words, if "old song" responses were not permitted, would there still be an
advantage for tune or word recognition in old songs?

A second issue concerns the extent to which the integration effect is
sensitive to subjects' strategies at the presentation stage. In the first
experiment, subjects listened to the presentation songs with the knowledge
that their memory for songs would be tested. Perhaps this instruction
engendered a global, integrated memory for melody and text at the presentation
stage. What remains to be determined is whether this integration is optional
or obligatory. In other words, would the integration effect still hold if
subjects were given the instruction to listen analytically? For example, if
subjects were told at the presentation stage that their memory for tunes would
be tested, would they be able to ignore the words?

A third question concerns the generality of the integration effect. In
the first experiment, the presentation and test tapes were recorded by the
same performer. Thus vocal inflection, timbre, and other variables in the
performance of melodies and texts would be similar across presentation and
test songs. It remains to be determined whether the integration effect is
sufficiently abstract to hold across different performance renditions of a
song. In other words, would the integration effect hold even for a
recognition test in which the items are sung by a different performer? Moreover, a
possible danger to avoid is that old song recognition might be an artifact of
the acoustical identity of old songs across the presentation and test tapes.
Any physical identity, even an accidental or musically irrelevant one, could
have contributed to the old song recognitions in Experiment 1. If the
integration effect were found to hold across different performers, it would
prove to be abstract as well as unattributable to the acoustical identity of
old songs.

Experiment 2 was designed to address these issues. Specifically,
Experiment 2 sought to determine (1) whether the integration effect would hold in a
melody-only rather than song recognition task; (2) whether it would hold even
in the face of instructions to listen analytically (that is, to tunes only) at
the presentation stage; and (3) whether it would hold across different
performers (and performance renditions) of the presentation and test songs.

Method

Materials. The materials were the same as in Experiment 1, except that
the five sets of presentation and test sequences were recorded a second time,
this time by a female vocalist in the alto range, a perfect fifth higher than
the male tenor recordings. While the same general guidelines were followed as
to tempo and other notated musical factors, no attempt was made to imitate
the tenor's performance renditions.

Design. The recordings by male and female vocalists allowed for four
combinations of male and female presentation and test sequences (M/M; M/F;
F/M; F/F). These four conditions were further subdivided into two
instruction conditions. The resulting eight conditions were applied across the five
sets of presentation/test sequences. This resulted in 40 conditions.

Procedure. The procedure was similar to that of Experiment 1, with the
following differences: Half the subjects received the same instructions as in
Experiment 1. The other half received "analytic" instructions: "Listen
carefully to these songs and your memory for the tune or melody only -- that is,
just the musical portions -- will be tested later. You can ignore the words
because you will not be tested on these." At the test stage all subjects made a
written response to the question, "Did you hear this exact melody before?" for
each item.

Subjects. Subjects were 48 undergraduate students of varying musical
backgrounds. Each of 40 conditions contained one subject. Eight additional
subjects were assigned to the first set of the presentation/test tapes,
distributed across the eight conditions of performance rendition and
instruction.
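
The layout of conditions in Experiment 2 can be enumerated directly, as in the sketch below. The variable names and labels are ours; only the factor levels and counts come from the Design and Subjects paragraphs. The sketch crosses presentation voice, test voice, instruction type, and tape set, then adds the eight extra subjects who were run on the first tape set.

```python
from itertools import product

VOICES = ["M", "F"]                    # male tenor, female alto recordings
INSTRUCTIONS = ["general", "analytic"]
TAPE_SETS = [1, 2, 3, 4, 5]            # five parallel presentation/test sets

# One subject per cell: presentation voice x test voice x instruction x set.
conditions = list(product(VOICES, VOICES, INSTRUCTIONS, TAPE_SETS))

# Eight additional subjects: one more per voice/instruction cell on set 1.
extra_subjects = [c for c in conditions if c[3] == 1]

n_subjects = len(conditions) + len(extra_subjects)
print(len(conditions), len(extra_subjects), n_subjects)  # 40 cells, 8 extra, 48 in all
```

With 48 subjects and 8 voice-by-instruction cells, the error term in an across-subjects ANOVA has 40 degrees of freedom, which agrees with the F(1,40) values reported in the results.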

Results and Discussion

Subjects in Experiment 2 found the folk song materials as familiar as
had subjects in Experiment 1. After the presentation of 24 songs, subjects
reported a mean of 1.[illegible] familiar songs.

Recognition of tunes. Table 4 compares tune recognition in mismatched
new songs and in old songs for two conditions of performance rendition (same
voice vs. different voice) and two conditions of instruction (general vs.
analytic). These data were analyzed in a three-way ANOVA across subjects; for
reasons having to do with the design of the experiment, the performance and
instruction factors were not included in the ANOVA across items.

The results confirmed the integration effect found in Experiment 1. That
is, even in this tune recognition task, subjects recognized tunes more
accurately in old songs (mean of .84 across all conditions) than in mismatch
songs (mean of .64), F(1,40) = 17.19, p < .001, across subjects and
F(1,18) = [illegible].42, p < .002, across items. Moreover, the integration effect was
maintained even for the analytic condition, where subjects were told to pay
attention only to tunes at the presentation stage. There was no significant
main effect for instructions or any interaction in this analysis.

Further, the integration effect held to a considerable degree even across
different performance renditions. Although Table 4 suggests that the
advantage for old songs was reduced in the different-performer condition, the
interaction of test item type and performance rendition was not significant,
F(1,40) = 3.86, p < .10. Finally, as in Experiment 1, whether or not an old
song was a real folk song made no difference, F(1,18) = 1.31.

                                  Table 4

              Mean Proportion of "Old Tune" Responses (Exp. 2)

Performance:                    Same                  Different
Instructions:             General  Analytic       General  Analytic

New songs (mismatch)        .60      .69            .69      .60
Old songs                   .94      .94            .73      .77

                                  Table 5

        Mean Proportion of "Old Tune" Responses: New Songs (Exp. 2)

                     General Instructions       Analytic Instructions
                            Words                      Words
                     New     Old     Mean       New     Old     Mean
Tune       New       .20     .57     .38        .36     .50     .43
           Old       .23     .64      --        .35     .64     .50
           Mean      .21     .61                .35     .57

We thus conclude that the integration effect is robust. Melody and text
appear to be integrated in memory, even in the face of attempts to focus on or
separate the melody at the presentation stage, and even when the performer is
different at the recognition stage.
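
The size of the old-song advantage in Table 4 can be recomputed from its eight cells, as sketched below. The cell values are read from Table 4; the dictionary layout and the helper names `mean` and `advantage` are ours. Note that these are unweighted cell means (.845 and .645), which the text reports as .84 and .64.

```python
# Table 4 (Experiment 2): proportion of "old tune" responses,
# keyed by (performance rendition, instruction condition).
table4 = {
    ("same", "general"): {"mismatch": 0.60, "old": 0.94},
    ("same", "analytic"): {"mismatch": 0.69, "old": 0.94},
    ("different", "general"): {"mismatch": 0.69, "old": 0.73},
    ("different", "analytic"): {"mismatch": 0.60, "old": 0.77},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def advantage(performance):
    """Mean (old song - mismatch) difference for one performance condition."""
    return mean(
        cell["old"] - cell["mismatch"]
        for (perf, _), cell in table4.items()
        if perf == performance
    )


overall_old = mean(cell["old"] for cell in table4.values())
overall_mismatch = mean(cell["mismatch"] for cell in table4.values())

print(round(overall_old, 3), round(overall_mismatch, 3))
print(round(advantage("same"), 3), round(advantage("different"), 3))
```

The advantage is numerically smaller when the performer changes, but as stated above the corresponding interaction did not reach significance.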

Responses to new songs. It remains to be determined how the effects of
instruction and performance rendition influenced (1) the accuracy of tune
recognition in new songs, and (2) the "contamination" effect of words on "old
tune" responses. The relevant data are shown in Table 5, separately for the
two instruction conditions but averaged over performance conditions, which
showed no effect here.

The data for new songs were analyzed in an ANOVA across subjects on "old
tune" responses with the factors of tune (old vs. new), words (old vs. new),
instruction condition (general vs. analytic), and performance rendition (same
vs. different). In the ANOVA across items, only the first two factors were
included.

As Table 5 shows, tune recognition in new songs was poor, even worse than
in Experiment 1. The main effect for tunes was not significant in either
analysis. Although it may appear that subjects had some success in
recognizing tunes when the words were old (compare hits with false alarms in "old
words" columns), in fact the tune x words interaction was not significant.
Thus subjects did not recognize tunes better than chance in new song contexts.
Moreover, tune recognition was equally poor regardless of instructions or
performance renditions.

However, there was a highly significant main effect for words, with
subjects giving many more "old tune" responses when the words were old (mean of
.59) than when the words were new (mean of .28), F(1,40) = 50.01, p < .0001,
across subjects, and F(1,18) = 37.58, p < .0001, across items. In addition,
this effect interacted with instructions, F(1,40) = [illegible].15, p < [illegible], in that it
was less pronounced in the analytic instruction condition.

In summary, Experiment 2 showed that the integration effect for memory of
original melody and text is both obligatory and abstract. Analytic
instructions did not reduce the integration effect; subjects were unable to ignore
the words in storing melodies at the presentation stage. Moreover, the
integration effect is generalizable across different performance renditions in
the presentation and test stages.

That tunes were recognized so poorly in new contexts would seem to argue
for an even stronger form of the integration hypothesis, a "wholistic"
conception of melody/text relations in memory. Even instructions to listen
analytically did not improve tune recognition. While it seems possible that there is
an asymmetry in memory integration, such that tunes are more dependent on the
words than vice versa, our findings may simply reflect the fact that the tunes
were much harder to remember than the words. This may be an artifact of the
folk song genre, since the melodies were in many ways similar (small range; G
tonic; homogeneous rhythm; mostly step-wise melodic motion), but the texts
were very different from each other. Moreover, texts could be recognized by a
single salient word (e.g., "Babylon" or "turkey"), but the tunes had no such
advantage. Ultimately, the question of which component is more memorable
boils down to the nature of the materials. We might imagine a reversal of the
memory advantage for words if we had selected texts that were very similar to
each other, and melodies that were widely discrepant. But in the case of the
folk songs used in our experiments, a natural asymmetry exists in the salience
of texts and melodies.

General Discussion

We conclude from these experiments that melody and text are integrated in
memory to a considerable degree. We found that the familiarity of old tunes
and words (when mismatched) was an insufficient predictor of the superior
recognition for original old songs. Moreover, we found no evidence that
subjects can voluntarily reduce the degree of integration of melody and text.
Indeed, what was surprising was not only the size of the integration effect,
but that subjects seemed to be unaware of it. Thus melody and text appear not
to be stored as independent components. On the other hand, a stronger or
"wholistic" form of integration appears to be untenable, at least as far as
the text is concerned. Our results leave open the possibility that, under
certain conditions, the melody may be completely integrated with the text (but
not vice versa).

In addition to this integration, there appears to be a reciprocal
"contamination" in familiarity judgments of melodies and texts. This effect
may be voluntarily reduced, though not entirely removed. The effect itself
may be an artifact of the selected materials, and it may depend on other
factors that are not clear at present; we have no explanation for the difference
between Experiments 1 and 2 in the magnitude of the influence of word
familiarity on tune judgments.

One question that is left unresolved by the present experiments is the
degree to which tune recognition in old songs may have been due to subtle
changes imposed on a melody by the specific texts employed. Two possibilities
are a semantic effect and a prosodic effect. For example, specific semantic
connotations may become associated with a melody when it is heard in
connection with a text about animals, cobblers, lullabies, dancing, and so forth.
These connotations may facilitate tune recognition in old songs or hinder
recognition when the text is different. To take an extreme example (Figure 2,
item b), it may be difficult to recognize a melody originally heard in
connection with a bluebird coming through a window, when that melody is later heard
in connection with an "old sow's hide."

An alternative hypothesis is that different texts impose prosodic or
submelodic variations on melodies. A change in text results in a drastic
change in the segmental structure of the words, which may have modified to
some extent what was nominally the same melody. For example, different
patterns of consonants, vowels, stresses, and voicing may influence the onset and
decay characteristics of tones and the precise degree of stress given to them.
Thus similarity of submelodic structure may have facilitated tune recognition,
even across different performance renditions, although it can hardly account
for the whole old song advantage.

We note here a natural asymmetry in the relation between (audible) melody
and text: While a tune can exist perfectly well without any words (when
played on a musical instrument, for example), words always have some kind of
"tune," if only the nonmusical one provided by the prosody of spoken language.
In the context of a song, the musical tune in large measure takes over the
function of prosody and thus becomes an aspect of the suprasegmental
properties of the words. Viewed in this way, it is quite conceivable that memory
for tunes is more dependent on memory for words than vice versa; certainly,
outside the realm of music the prosody of speech is remembered, if at all,
only as an aspect of the words by which it is carried. We hope to investigate
this interesting parallel between speech and music in future experiments.

APPENDIX

Pairs of Folk Song Excerpts with Interchangeable Texts.
All Folk Songs from Erdei (1974)

Pair   Number/Title                                 Number/Title

 1       9: Hunt the slipper                         92: Cape Cod girls
 2      12: Let us chase the squirrel                73: Christ was born
 3      15: Who's that tapping at the window?        82: Mary had a baby
 4      16: How many miles to Babylon?              120: Nuts in May
 5      21: Poor little kitty puss                   80: Turn the glasses over
 6      22: Down in the meadow                       68: The old woman and the pig
 7      27: Hush little baby                         13: Bye, bye baby
 8      32: Bluebird                                 55: The old sow
 9      38: Ida Red                                  39: Mama, buy me a chiney doll
10      52: Dear companion                           8[illegible]: Wayfaring stranger
11      67: I lost the farmer's dairy key           128: Watch that lady
12      69: Old turkey buzzard                       72: My good old man
13      78: Hold my mule                            102: Needle's eye
14      99: When the train comes along              132: Hushaby
15     103: Housekeeping                            147: My old hen
16     148: I'm goin' home on a cloud               138: The raggle taggle gypsies
17     110: Give my love to Nell                    137: Blow, boys, blow
18     122: Cripple Creek                           129: The little dappled cow
19     142: Goodbye girls, I'm going to Boston      144: Cradle hymn
20     2[illegible]: The boatman                    86: The Derby ram

^Mlnor alteration was made in text 
'^inor alteration was made in melody 
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THE EQUATION Of INFORMATION AMD HEANINC FROM THE PESSFECTIVES OF SITUATION SE- 
MANTICS AND GIBSON'S ECOLOGICAL REALISM* 



M. 7. Turvey+ and Claudia Carello-h+ 



Introduction

As Barwise and Perry suggest, their theory of meaning is consistent on
several fronts with Ecological Realism as it has been developed by the
psychologist James J. Gibson. The most important convergence from our
perspective is the shared conviction that meaning is neither in the brain (the
residence openly preferred by orthodox psychologists) nor in some
netherworld (a location intimated by Fregean semantics). Rather, meaning is
contained in the system defined by the nested relations between the real
properties of a living thing and the real properties of the environment with
respect to which the living thing conducts its daily affairs.

How is this type of realist account of meaning supported? Both Gibson
and Barwise and Perry have attempted to ground meaning in information. But
both are extremely careful about the sense in which information is to be used.
Gibson (1966) pointed out that information theory in the style of Shannon
(1949) was not adequate to the demands of perceiving, that is, obtaining
information about activity-relevant properties of the environment. Whereas information
for communication engineering is assumed to be finite and transmittable,
information for perceptual systems is inexhaustible and noticeable (i.e., not
carried, as through a channel) (Gibson, 1979). To characterize information as
a quantifiable reduction in uncertainty does not require a consideration of
meaning; to characterize information as the specification of the observer's
environment demands it. Similarly, Barwise and Perry deny Dretske's (1981)
assertion that meaning and information are dissociable. Instead, situation
semantics and ecological psychology place what Barwise and Perry call
"constraints on the structure of reality" at the heart of their attempts to
consider meaning and information conjointly. That is to say, if an event, A,
is linked systematically to another event, B, A is information about B; the
linkage is meaningful.

It is in the nature of the linkage that the different emphases of the two
approaches can be seen. Gibson (1954, 1966) identified three such linkages:
convention, projection, and natural law. These underwrite the relationships
between, for example, an automobile and its license, an automobile and its
shadow, and a moving automobile and the optical flow pattern it generates,
respectively. Examples of the last type, what Barwise and Perry refer to as
information based on nomic structural constraints, are at the core of the
Gibsonian program. An understanding of the information required for animals
to control locomotion in a cluttered surround is considered propaedeutic to
understanding information of the other types. The focus is on uncovering laws
at the ecological scale (i.e., appropriate to a given animal-econiche system)
(Turvey, Shaw, Reed, & Mace, 1981) that underlie information in the
specificational sense (Reed, 1981; Turvey & Kugler, 1984, in press), captured as
follows:



                     generates by law
    Situation-type A ----------------> Situation-type B          (1)
                     <----------------
                         specifies



Information in the pictorial sense and information in the indicational
sense (that central to linguistic meaning) can be schematized similarly:



                     produces by projection
    Situation-type C ----------------------> Situation-type D,   (2)
                     <----------------------
                            depicts

and

                     is conventionally linked to
    Situation-type E ---------------------------> Situation-type F,   (3)
                     <---------------------------
                              indicates



respectively. For Gibson, both of these are predicated on information in the
specificational sense. A representational picture, for example, is a surface
treated in such a way as to make available some of the same (formless and
timeless) invariants that are available in the real scene (Gibson, 1979). In
the same vein, the symbolic waggle dance of the bee indicates the location of
a source of honey in the invariant pattern of dips and twists. In all cases,
meaning is there to be discovered, whether the animal is immersed in a
lawfully structured sea of energy, encounters an arrested array of persisting
invariants, or confronts culturally determined conventions. That is to say,
even if the constraints are at some remove from the animal-environment system,
each new individual need not reinvent or recreate them. Rather, the
systematicity of the relationships must be noticed. But the fundamentality of
information in the specificational sense runs still deeper. In order for
information in the indicational sense to be efficacious, information in the
specificational sense already must be available. For example, in order for a
stop sign to regulate the dynamics of traffic flow and, therefore, for its
meaning to be realized, information specifying the retardation of forward
motion and the time-to-contact with the place where velocity must go to zero
must be available.

We will pursue the notion of information in the specificational sense in
the section that follows in an effort to support the arguments of Gibson and
Barwise and Perry that meaning and information can be equated.

Information in the Specificational Sense and Situation-type Meaning

Consider a transparent medium (air or water) that is densely filled with
light scattered by a substantial surface below. Now consider a point of
observation that is moving in the medium rectilinearly relative to the ground.
In order to define an optical field that flows relative to the point of
observation, each point of the ambient light can be assigned a vector that is
opposite that of the vector of the point of observation. If, for example, the
point of observation is moving toward a point, then the optical field will
flow outwards from that 'target point'. That is, there is a lawful relation
of the type: forward rectilinear motion of a point of observation (F) -->
global optical outflow (O). (This is an instance of a more general law of
ecological optics formulated as: a particular motion of a point of
observation relative to a surround --> a particular global transformation of
the ambient optical field.) Turning the relation around, global optical
outflow is said to be information about forward rectilinear motion of a point
of observation in the sense that, given that there are no other natural ways
of producing global optical outflow (Turvey, 1979), global outflow is specific
to forward rectilinear motion. This is Gibson's way of defining the
information contained in the light: it is optical structure lawfully generated
by the layout of surfaces and by movements of the point of observation
relative to the layout. We can capture the essence of the Gibsonian view in
terms of Barwise and Perry's situation-type:

                     lawfully generates
    Situation-type F ------------------> Situation-type O.       (4)
                     <------------------
                          specifies

Put very simply, under Gibson's ecological analysis, O means F.
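
The lawful relation F --> O can be illustrated numerically. The sketch below
is not part of the original analysis: it assumes a pinhole image plane in
place of the full ambient optic array, and the function names and sample
points are hypothetical. It shows only that forward translation makes every
projected point stream away from the 'target point', and that reversing the
motion reverses the flow.

```python
import numpy as np

def translational_flow(points_xyz, forward_speed):
    """Image-plane flow for an observer translating along +Z (pinhole model).

    points_xyz: (N, 3) scene points in observer coordinates, with Z > 0.
    Returns the projected positions (x, y) and their velocities.
    """
    pts = np.asarray(points_xyz, dtype=float)
    X, Y, Z = pts[:, 0], pts[:, 1], pts[:, 2]
    image = np.stack([X / Z, Y / Z], axis=1)
    # The observer advances, so each point's depth shrinks: dZ/dt = -speed.
    # Hence d(X/Z)/dt = (X/Z) * speed / Z, and likewise for Y/Z.
    flow = image * (forward_speed / Z)[:, None]
    return image, flow

def is_global_outflow(image, flow):
    """True if every flow vector points away from the focus at the origin."""
    radial = np.sum(image * flow, axis=1)
    return bool(np.all(radial > 0))

# Hypothetical ground points 1.6 units below the point of observation.
ground = [[1.0, -1.6, 5.0], [-2.0, -1.6, 10.0], [0.5, -1.6, 2.0]]
image, flow = translational_flow(ground, forward_speed=3.0)
print(is_global_outflow(image, flow))
```

Running the same points with a negative speed yields inflow rather than
outflow, which is the sense in which the global pattern is specific to the
direction of motion.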

In many circles, however, there is a reluctance to use the term 'means'
or to construct a phrase of the form 'O's meaning is F' in the absence of a
living thing, an agent. Thus, for Barwise and Perry a relation such as (4) is
only one half of their theory of meaning as it might apply to a given animal.
The other half is the attunement of the animal in question to the relations.
For them, information and meaning are equated but the equation holds, strictly
speaking, only when there is attunement of the animal. In short, in Barwise
and Perry's situation semantics, the meaning of an event o that is of the
event-type O is a product of the relation F --> O and attunement to the
relation. Putting a living thing that sees and locomotes (and which,
therefore, must be attuned by definition) at the point of observation relative
to an artificially generated outflowing global optical field underscores the
identity of information and meaning to which the ecological approach and
situation semantics subscribe. If O means F, then for a human observer
maintaining an upright stance, global optical outflow will induce backward
postural adjustments since forward movement rather than vertical stasis is
'occurring.' Experimentally, this is shown to be so (Lishman & Lee, 1973;
Lee & Aronson, 1974).

We are not fully comfortable with the notion that the relation of O and F
can be talked about in two ways: (1) as 'O informs about F' in the absence of
an attuned agent, and (2) as 'O means F' given an attuned agent. Our
discomfort arises from the desire to develop a consistent direct realist
position in perceptual theory (Michaels & Carello, 1981; Turvey et al., 1981;
Shaw, Turvey, & Mace, 1982) and our recognition of how elusive this realist
goal has been in the past. It may be a quibble but it seems to us that a
realist perspective is undercut to the degree that we cannot talk clearly and
confidently about situations and events as having meanings for the activities
of organisms, regardless of the psychological states of organisms. Given the
low-level development of the concept of attunement in situation semantics,
there is a danger that attunement might be read in a psychologically
contributory sense, viz., the organism is able to interpret the information,
that is, to ascribe meaning to the information.

From a realist viewpoint, meanings are discovered by animals, not
invented or created by them. The nomic structural constraints of ecological
optics relate kinetic and kinematic facts at the ecological scale to optical
structure. They are the sine qua non for the evolution of visually guided
locomotion, whether the forces for locomotion be produced by legs, fins,
wings, or machine (Gibson, 1979). To say that the lawfully produced optical
properties are merely information fails to convey the existential import of
the nomic constraints of ecological optics: They have been the basis for the
successful locomotion of an indefinitely large number of species for a very
long period of time. Indeed, we would speculate (and, we hope, not glibly)
that attunement to these constraints could not have come about unless they
were already meaningful, that is, unless the kinetic consequences of a
(naturally occurring) global optical pattern always held.

To fix this equation of information (in the specificational sense) and
meaning, consider a point of observation moving on a rectilinear path that is
interrupted by a substantial surface perpendicular to the ground. The
structured light to the point of observation is usefully construed as nested
visual solid angles with the point of observation as their common vertex
(Gibson, 1979). Crudely speaking, the larger solid angles correspond to the
faces of surface layout and the smaller solid angles correspond to the facets.
As a moving point of observation approaches the substantial surface on its
path, the corresponding visual solid angles will dilate. Analysis shows that
the inverse of the rate of dilation is a global property that is specific to
the time-to-contact between the point of observation and the surface (Lee,
1976, 1980). To be somewhat pedantic, when a point of observation approaches a
surface under constant force conditions (the kinetic perspective) and,
therefore, at a constant velocity (the kinematic perspective), it defines a
physical situation such that, for any distance between the point and the
surface, there is a corresponding time before point and surface contact. The
light is lawfully structured by this physical situation of imminent collision
such that there are optical properties unique and specific to the facts that a
collision will occur and that it will occur at a certain delay. We can
identify the time-to-contact optical property, T(t), then, in the terms of
Barwise and Perry:

                        lawfully generates
    Situation-type C(t) ------------------> Situation-type T(t), (5)
                        <------------------
                             specifies

where T(t) means one thing and one thing only, namely, contact C, at so many
seconds from now if the current conditions of motion persist. Contact will
occur whether the point of observation is filled by an attuned agent, a blind
agent, or a trolley. A given value of T(t) means collision at a certain time.
This fact of nature is the sort of meaningful invariant to which perceptual
systems could adapt and become sensitive or attuned. That T(t) is meaningful
in the way we have suggested is shown in its use by gannets in controlling
their diving for fish (Lee & Reddish, 1981), by flies in initiating their
deceleration prior to contacting a surface (Wagner, 1982), and by humans in
leaping to hit a ball (Lee, Young, Reddish, Lough, & Clayton, 1983).
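
The claim that the inverse of the (relative) rate of dilation is specific to
time-to-contact can be checked with a small numerical sketch. It is offered
as an illustration under stated assumptions, not as the analysis in Lee
(1976): a single flat patch faced head-on stands in for the nested solid
angles, the approach speed is constant, and the function names and numbers
are hypothetical.

```python
import math

def visual_angle(size, distance):
    """Visual angle (radians) subtended by a patch of the given size."""
    return 2.0 * math.atan(size / (2.0 * distance))

def tau_from_dilation(size, distance, speed, dt=1e-4):
    """Estimate time-to-contact as angle / (rate of change of angle).

    Only the angle and its rate of dilation enter the estimate; size,
    distance, and speed are used solely to generate the optical input.
    """
    theta_now = visual_angle(size, distance)
    theta_next = visual_angle(size, distance - speed * dt)
    dilation_rate = (theta_next - theta_now) / dt
    return theta_now / dilation_rate

# Two patches of different size, both 20 units away, approached at 10 units/s.
print(tau_from_dilation(0.5, 20.0, 10.0))
print(tau_from_dilation(0.1, 20.0, 10.0))
```

For visual angles that are small, both estimates lie close to distance/speed
(here 2 s) even though the patches differ in size, which is the sense in
which the optical property specifies when contact will occur without
requiring distance or speed to be known separately.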

As we have noted, T(t) is information about an upcoming collision if the
current conditions of motion persist. Obviously, the collision need not be
inevitable if the conditions of motion are changed in appropriate ways, for
example, if the point of observation stops or veers to the side. Moreover,
the strength and timing of the collision can be controlled (as demonstrated by
the examples above) if the point of observation accelerates or decelerates
appropriately. Is there information for what is appropriate? For example, is
there information specific to the circumstance 'deceleration is sufficient to
come to a halt before contacting the surface'? Such control information is
available in the first derivative of the time-to-contact variable, dT(t)/dt.
In particular, if dT(t)/dt >= -0.5, then deceleration is sufficient and there
will not be contact; if dT(t)/dt < -0.5, there will be contact.
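
The critical value of -0.5 follows from elementary kinematics. Assuming
constant deceleration a, closing speed v, and distance d to the surface (all
symbols introduced here for illustration), T = d/v gives
dT/dt = -1 + a*d/v^2, while stopping short of the surface requires the
braking distance v^2/(2a) to be no greater than d, that is, a*d/v^2 >= 0.5.
The sketch below, with hypothetical function names, compares the optical
criterion with the kinematic outcome.

```python
def tau_dot(distance, speed, deceleration):
    """dT/dt for T = distance / speed under constant deceleration."""
    return -1.0 + deceleration * distance / speed ** 2

def predicts_no_contact(distance, speed, deceleration):
    """Optical criterion from the text: dT/dt >= -0.5 means no contact."""
    return tau_dot(distance, speed, deceleration) >= -0.5

def stops_before_contact(distance, speed, deceleration):
    """Kinematic outcome: braking distance v^2 / (2a) fits in the distance."""
    if deceleration <= 0:
        return False
    return speed ** 2 / (2.0 * deceleration) <= distance

# The category boundary is the same across speeds and distances.
for d, v, a in [(10.0, 10.0, 6.0), (10.0, 10.0, 4.0), (40.0, 20.0, 6.0)]:
    print(d, v, a, predicts_no_contact(d, v, a), stops_before_contact(d, v, a))
```

Because a*d/v^2 carries no units, the same boundary value separates contacts
from noncontacts whatever the speed, distance, or size of surface involved.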

This 'type-of-contact' variable is of particular interest because it is a
dimensionless quantity (i.e., it is not attached to any units of measurement)
that distinguishes natural categories: contacts vs. noncontacts. The category
boundary does not change (the meaning of the situation does not change) with
changes in speed of the observation point, its distance from the surface, or
the size of the surface. The information specifying the category boundary is
lawfully produced by the movement of a point of observation with respect to a
surface. We have suggested elsewhere (Kugler, Turvey, Carello, & Shaw, in
press; Turvey & Kugler, in press-a) that dimensionless quantities that mark
off distinct specificational states play the same significant role in
law-based explanations of the control of activity as dimensionless quantities
that mark off distinct physical states play in law-based explanations of
cooperative phenomena.

How Situation-type Meanings Become Situation Meanings

The above are examples of optical properties lawfully linked to particular
relationships between a moving point of observation and a layout of
surfaces. They are examples of nomic structural constraints that underwrite
situation-type (or event-type) meanings for locomoting agents, if agents
happen to be about. Building the laws of ecological optics around an
unoccupied point of observation is an important move: The laws are thereby
seen to be general and public, in that any observer, in principle, can occupy
any point of observation and share with other observers over time the
invariants in the ambient optic array at that point. Given the fact that these
situation-type meanings are observer-indifferent, however, we need not expect
them to determine activity fully when an observer is brought into the picture
(just as we do not expect the laws of motion by themselves, operating, as they
must, within certain boundary conditions, to rationalize fully a given
particle's trajectory). An occupant at a point of observation transforms a
situation-type meaning into a situation meaning and while the latter depends
on the former, it is not identical with it, as Barwise and Perry take great
pains to note.

We wish to show, as do Barwise and Perry, that there is nothing spooky
about this transformation of situation-type meaning into situation meaning.
An observer occupying a point of observation will have a magnitude (height,
weight) that defines an intrinsic scale for the laws of ecological optics, and
a repertoire of effectivities (goal-directed activities) that define the uses
to which the information based on these laws is to be put. As it is with
Barwise and Perry's discourse situations, connections, and resource
situations, which squeeze different situation meanings out of an invariant
linguistic situation-type meaning (underwritten by conventional structural
constraints), so it is with scale and intention, which squeeze different
situation meanings out of an invariant action situation-type meaning
(underwritten by nomic structural constraints). We expect that, formally
speaking, these two sets of 'boundary conditions' may have much in common.
Let us concentrate, however, on examples of how scale and intention produce
situation meanings.

The motion (F') of a point of observation over one surface towards a
drop-off to another lower surface will lawfully generate an optical flow (O')
distinguished by a discontinuity, viz., a horizontal margin, above which
optical structure magnifies and gains and below which optical structure
magnifies but does not gain. This nomic structural constraint and the
information in the specificational sense that it yields can be represented as:

                      lawfully generates
    Situation-type F' ------------------> Situation-type O'      (6)
                      <------------------
                           specifies

The situation-type meaning of O' is 'approaching a brink'. If the point of
observation is occupied, say, by a running, four-legged animal, then the
situation-type meaning is too general and insufficiently constrains the
animal's behavior. The richer, particular meanings of 'approaching a step-down
place' or 'approaching a jump-down place' or 'approaching a falling-off place'
are required for the successful control of locomotion. These meanings are
situation meanings. They depend on the magnitude of the brink relative to the
size of the animal. What is a step-down place for one animal (e.g., a horse)
is a jump-down place or a falling-off place for another animal (e.g., a
mouse).


The orthodox move is to treat these situation meanings as subjective,
that is, as mental categories imposed on an objective, meaningless
surround. This is where the spookiness creeps in. Gibson's ecological realism
and Barwise and Perry's situation semantics reject this move to subjective
categories. Rather, the situation meanings in question must be underwritten
by scaled nomic structural constraints; they are real relations between real
properties of the animal-environment system to which the animal can become
attuned.

The strategy, roughly speaking, is to note that (a) the magnitudes of
surface layout are describable in units of the animal such as eye height or
stride length; (b) above some critical number of a body-scaled unit such as n
(eye heights), a drop-off cannot be negotiated by stepping down; (c) the
optical flow can be shown to specify surface layout in body-scaled units (Lee,
1980); and (d) given (c), there is a dimensionless optical property like
dT(t)/dt that marks off at a critical value distinct specificational states,
viz., 'approaching a step-downable place' and 'approaching a non-step-downable
place' (Turvey & Kugler, 1984). That is, given the optical structure O'
fashioned by any point of observation approaching any brink in a surface,
there is a scale transform s effected by a particular animal a at the point of
observation such that s(O') --> O'', where O'' is the optical structure
specific to the brink in the scale of a. In Barwise and Perry's terms, O' is
efficient: although its meaning is fixed ('brink'), its "interpretation"
('step-downable,' 'not step-downable') varies with s. To reiterate another
central theme of situation semantics, "...efficiency is crucial to all
meaning."
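
Steps (a) and (b) of the strategy can be sketched as a simple rescaling
followed by a comparison with a critical number. The sketch is illustrative
only: the eye heights, the critical ratio, and all names below are
hypothetical placeholders, not measured values or quantities taken from the
sources cited.

```python
from dataclasses import dataclass

@dataclass
class Animal:
    name: str
    eye_height_m: float         # intrinsic scale supplied by the animal
    critical_eye_heights: float  # hypothetical critical number n

def body_scaled_drop(drop_height_m, animal):
    """Express a brink's drop in units of the animal's eye height."""
    return drop_height_m / animal.eye_height_m

def brink_meaning(drop_height_m, animal):
    """Situation meaning of one and the same brink for a particular animal."""
    ratio = body_scaled_drop(drop_height_m, animal)
    if ratio <= animal.critical_eye_heights:
        return "step-downable"
    return "not step-downable"

# Hypothetical animals sharing the same critical ratio but differing in scale.
horse = Animal("horse", eye_height_m=1.6, critical_eye_heights=0.3)
mouse = Animal("mouse", eye_height_m=0.03, critical_eye_heights=0.3)

for animal in (horse, mouse):
    print(animal.name, brink_meaning(0.3, animal))
```

The brink itself is held fixed at 0.3 m; only the scale supplied by the
occupant of the point of observation changes, and with it the category into
which the brink falls.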

A similar scenario can be written for the role of intention in
transforming situation-type meaning into situation meaning. For example, a
baseball fielder bent on catching a fly ball transforms situation-type meaning
'impending collision' into 'thing to be intercepted,' while a boxer with a
glass jaw transforms 'impending collision' into 'thing to be avoided.' Each
intention defines a natural category (selects values of the final conditions
of a law) such as 'hard contact,' which, in turn, specifies the activities
that will produce that category (i.e., constrains values of the initial
conditions of the law). Just as understanding the interpretation of an
utterance requires understanding its context of use in Barwise and Perry's
terms, so understanding how an intention transforms a situation-type meaning
into a situation meaning requires understanding the context of laws under
which the intention brings the animal, including the convention that defines
the initial conditions to be assumed given the final conditions to be obtained
(see Turvey et al., 1981, for a more thorough discussion of this line of
reasoning).

Throughout Gibson's ecological realism and Barwise and Perry's situation
semantics is a commitment to treat meaning as an aspect of reality. This sort
of treatment gives rise to explanations of meaning that appeal to natural law;
understanding meaning is not qualitatively different from understanding other
natural phenomena. The strategy is reinforced, for the Gibsonian program, in
never losing sight of the control of locomotion as the paradigmatic problem to
be understood. Locomotion is a skill that is not limited to humans and,
therefore, the temptation to ascribe it to special mental powers is lessened.
Barwise and Perry, on the other hand, are trying to be realists in a bailiwick
where mentalese is at its most alluring. We applaud their efforts.

1. Although we agree with the distinction that Barwise and Perry draw
between situation-type and event-type, for purposes of exposition we use
situation-type for both circumstances.

A COMMENT ON THE EQUATING OF INFORMATION WITH SYMBOL STRINGS*
M. T. Turvey+ and Peter N. Kugler++



Physicists have been inclined in the past to regard "information" as a
physical variable similar in kind to energy or matter (e.g., Layzer, 1975;
Tribus & McIrvine, 1971). There are objections, however, to carrying this
inclination over to the realms of biology, physiology, and psychology. The
equating of "information" with negative entropy or an absolute measure of
objective order does not adequately capture the ways in which the term
"information" is used in explanations of the phenomena characteristic of those
realms. There is a very general impression that the various explanatory roles
ascribed to "information" in biology, physiology, and psychology are
performable by symbols organized by a grammar.

What are the roles that symbol strings fulfill? Fundamentally, they are
indicational and injunctional: Symbol strings can indicate states of affairs
(e.g., "deficiency of metabolite so-and-so"; "road work ahead") and they can
direct or command states of affairs (e.g., "release hormone so-and-so!";
"slow down!"). This popular quasi-linguistic view of "information" (what
might be termed the indicational/injunctional sense of information; cf. Reed,
1981; Turvey & Kugler, 1984) is central to the papers of Bellman and
Goldstein, and Iberall. Our efforts in this brief note are directed at putting
this indicational/injunctional sense of information into perspective. Insofar
as the issue of the continuity of linguistic and movement capabilities
involves the concept of information, clarifying the different senses of the
concept, and their relationship, will prove helpful. Two rather different sets
of arguments are involved: those attributed to Howard Pattee and those
attributed to James Gibson.

Pattee (1973, 1977) has identified two modes of complex system
functioning: A discrete mode characterized as rate-independent operations on a
finite set of symbols, and a continuous mode that refers to the rate-dependent
interplay of dynamical processes. Given this distinction, one can ask how
symbol strings and dynamics coevolve from the cellular level up through the
evolutionary scale. More pointedly, the question can be raised: Are there
universals of symbol string/dynamics interactions that might be appropriate to
an understanding of the linguistic and coordinated movement capabilities of
living systems? Pattee addresses these questions through the problem of
enzyme folding. This particular example consists of two qualitatively
different phases: the genetic code synthesizes an amino acid string, which
then folds into a functioning enzyme. The translation of the DNA symbols into
amino acid strings is a discrete symbolic process, while the folding of the
one-dimensional amino acid string into a three-dimensional machine is a
continuous dynamical process. The former is a constraint on the latter. To
describe the relationship as one of constraint is an important step for
Pattee, for it suggests that the system's meaning (its dynamic ability) does
not merely reduce to a symbolic representation. The symbolic mode harnesses
the forces responsible for the function, but the symbolic mode is not equated
with the function. But neither is the dynamic mode completely autonomous.
The folding of the enzyme cannot proceed until the code provides the necessary
constraint. In other words, neither mode alone is sufficient for the activity
in question.

*American Journal of Physiology, in press.
+Also Department of Psychology, University of Connecticut.
++Also Crump Institute for Medical Engineering, University of California, Los
Angeles.

Of significance is the observation that the discrete symbolic mode,
information in the indicational/injunctional sense, is kept to a minimum in
natural systems (Pattee, 1980). Information construed quasi-linguistically
does not provide all of the details for a given process; it acts as a
constraint, of the nonholonomic type, on natural law so that the dynamic
details take care of themselves. In other words, by Pattee's analysis, most of
the complex behavior of living systems is essentially self-assembly that is
"set up" by symbol strings, but not explicitly controlled by them. Presumably
this should be no less true of the linguistic and movement coordination
capabilities of biological systems. For Pattee, complete comprehension cannot
be gained by appealing to symbol-string processing or to physics alone.
Both must be used together, but in a special way. Pattee advises: Use physics
cleverly so that symbol strings need only be used sparingly in order to
assure the parsimony of the explanation.
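
The discrete, rate-independent character of the symbolic mode can be made
concrete with a toy translation routine. This is only a sketch for
illustration: the table holds a handful of entries from the standard genetic
code, the example sequence is invented, and reading the coding DNA strand
directly (skipping the messenger RNA step) is a simplifying assumption.

```python
# A few entries of the standard genetic code; None marks a stop codon.
CODON_TABLE = {
    "ATG": "M",  # methionine
    "TTT": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "GCT": "A",  # alanine
    "TAA": None,
}

def translate(dna):
    """Map successive codons to one-letter amino acid symbols.

    The mapping is a pure table lookup: it says nothing about how fast the
    string is produced or how the resulting chain will fold.
    """
    residues = []
    usable_length = len(dna) - len(dna) % 3
    for start in range(0, usable_length, 3):
        codon = dna[start:start + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon!r} is not in this toy table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:
            break
        residues.append(amino_acid)
    return "".join(residues)

print(translate("ATGTTTGGCAAAGCTTAA"))
```

Nothing in the routine determines the three-dimensional structure of the
product; in Pattee's terms the symbol string only constrains a folding
process whose details are left to the dynamics.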

As noted, symbol strings are incomplete: they are limited in detail with
respect to the detail of the processes that they indicate or direct. A number
of perplexities are generated by this incompleteness. For example, on what
grounds and by what means does a particular symbol string get created rather
than another, referring elliptically to one set of properties of the indicated
or directed dynamical process rather than another? What determines the detail
of the indicated or directed dynamical process that the symbol string
represents? Taken together, these two questions require an answer beyond that
given by a physics (e.g., Prigogine's Dissipative Structure Theory, Iberall's
Homeokinetics) that seeks to explain how structure evolves with a consequent
loss of dynamical degrees of freedom. What is required is an explanation of
how that loss is special, yielding a symbol string, an alternative description
(Pattee, 1972), that is privileged with respect to the dynamical process that
it indicates or directs (see Carello, Turvey, Kugler, & Shaw, 1984). There
are shades of the problem of induction (Goodman, 1965) here, the problem of
projectable predicates or properties, which continues to resist solution in
conventional philosophy and psychology. Consider another consequence of
incompleteness. Because of its necessarily reduced detail, a symbol string
cannot specify a process or act, that is, it cannot provide a lawful basis for
the process. This is not to say that information in the
indicational/injunctional sense cannot be responsible for a process in part,
only that it cannot constrain a process in full. Pattee's paradigmatic example
is meant to suggest that the known laws of physics complete the picture,
filling in what the symbol string leaves out. But we doubt whether all
relevant examples succumb to this solution, tout court. It seems to us that in
many (if not most) biological settings the dynamical details "take care of
themselves" because there is non-symbolic information that specifies how they
should do so. As Iberall and Soodak (in press) express it, a cooperativity is
a state of affairs of an ensemble that is maintained from below by the
activity of the atomisms of the ensemble and from above by the field boundary
conditions (equated with nonholonomic constraints qua symbol strings in most
biological instances). The intimation below is that cooperativities involving
biological atomisms are predicated in large part on information in a
non-symbolic sense that is made available in the course of atomistic activity.

Gibson's (1966, 1979; Reed & Jones, 1982) focus has been the control of
locomotory activity in natural cluttered surroundings. His definition of
information is explicit and distinct from the orthodox sense of information as
indicational/injunctional. For Gibson, information in the case of vision is
optical structure that is lawfully generated by environmental structure (the
layout of surfaces) and by movements of the animal (both movements of the
limbs relative to the body and movements of the body relative to the
surround). The optical structure does not resemble the facts of the
animal-environment system, but it is specific to them, in the sense of being
lawfully dependent on them. In short, Gibson's sense of information is
specificational. A simple example illustrates the relation between the two
senses of information, Gibson's and the orthodox. Symbol strings on the
highway of the type SLOW DOWN and STOP are intended to direct the dynamics of
traffic flow. For atomisms (humans) that can read the symbol strings,
complying with these injunctions is possible only if there is continuously
available information specific to the retardation of forward motion and the
time to contact with the place where velocity is to go to zero. A deceleration
of global optical outflow specifies the slowing down of a moving point of
observation relative to the persistent, non-moving layout of surrounding
surfaces. The inverse of the rate of dilation of the visual solid angle to the
point of observation that is created by approach to the place where motion is
to be fully arrested specifies continuously the time at which the place will
be contacted. And the first derivative of the time-to-contact optical property
specifies that the forward motion will or will not be arrested in time under
the current conditions (forces) of motion. (See Gibson, 1979; Kugler, Turvey,
Carello, & Shaw, in press; Lee, 1980, for a detailed discussion of each of
these forms of specification.) This example suggests that without information
in the specificational sense, information in the indicational/injunctional
sense is impotent. Further, this example suggests that for a given process,
the degree of detail in a symbol string is inversely related to the
availability of information in the specificational sense. At the very least,
the information available in the specificational sense determines the lower
bound on the detail of information in the indicational/injunctional sense.



Stated in more general terms, Gibsonian information is a physical
variable that can be identified with low-dimensional macroscopic properties of
low-energy fields lawfully generated by properties of system-and-surround
(Kugler et al., in press). For a system that has an onboard source of
available potential energy (such that it can resist the surround's forces
through the generation of forces of its own), information in the Gibsonian
specificational sense is the basis of the system's coupling to its surround.
Where a convention, abstractly interpreted, leads the system to take a
nongeodesic path (route), information in the specificational sense provides
the support by which this elected activity is made possible.

In summary, the points we wish to underscore are these: (1) the
indicational/injunctional sense of information is not exclusive; (2)
information in the indicational/injunctional sense is predicated on
information in the specificational sense; and (3) the perplexities surrounding
the incompleteness of symbol strings may be dismissed in a principled fashion
by a thoroughgoing analysis of information in the specificational sense (cf.
Carello et al., 1984; Turvey & Kugler, 1984).

AN ECOLOGICAL APPROACH TO PERCEPTION AND ACTION*
M. T. Turvey+ and Peter N. Kugler++



1.0 Introduction

In his chapter on "Some emergent problems of the regulation of motor
acts" Bernstein (1967) identifies four major problems:

(1) If perceiving were not a matter of being accurately aware of the objective
facts of the environment and of one's actions, then the reliable control of
activity would not be possible. However, the orthodox theory of receptor
processes implies an arbitrary relation between these processes and the
circumstances (environment and action) to which they nominally refer. This
theory is inadequate to explain the everyday achievements of animal activity.
What is needed is a theory that accounts for how perceiving keeps an animal in
contact with the reality that bears on the successful conduct of its actions.

(2) Patently, animal activity is an instance of self-regulation, but what kind
of self-regulation? Is it of the type conventionally expressed by
self-regulating artifacts or do the regularities of animal activity follow
from principles that are, as yet, unique to natural systems?

(3) Neither the geometry nor the kinematics of movement can serve, in the
general case, as the determinant of the composition of an act. An action is
what it is by virtue of its intention, that is, the motor problem (a needed
change in the relation of the animal and its environment) toward which the
action is directed as a solution. How are we to understand an intention as
(a) the principle guiding the overall formation of an act and (b) the
influence dominating the selection of its details?

(4) Clearly the control of activity is more than a retrospective matter. In
the most general of cases, control must be prospective. For example, in
basketball, one exerts forces against the ground of a specific magnitude so as
to cause the hands to be at a specific height at a specific time to intercept
a thrown ball. What is the basis of this anticipatory capability that makes
possible the realization of any goal-directed activity?

Problems (1) and (4) are discussed in Section 2.0 and problems (2) and
(3) are discussed in Section 3.0.

2.0 On the Objectivity and Accuracy of Perceiving

For any animal, activity takes place with respect to surfaces. For
terrestrial animals, the most important surface is the ground. The ground is
not even. Neither is it geometrically and materially uniform from place to
place. There are gradual and sharp changes in the ground level. There are
cracks and gaps. Liquid and solid areas are interspersed. Further, the
ground surface is cluttered with closed, substantial surfaces. Some of these
are attached, others are movable and some move under their own power. The
clutter varies greatly in size. But for any terrestrial animal there are
always closed, substantial surfaces both smaller and larger than its size. Some
of the ground's clutter are barriers to locomotion, but invariably there are
gaps large enough to permit passage and barriers small enough to be hurdled or
climbed. Locomoting from place to place, finding paths through the clutter,
is necessary given the uneven distribution of the resources on which the
persistence of the animal depends.

As Bernstein remarks, the meaningful problems that activity solves arise
out of the layout of surfaces surrounding the animal, the environment. A few
such meaningful problems are depicted in Figure 1. Awareness of the
"problems" and awareness of the activities that do or do not solve them is the
role of perceiving. It is obvious to Bernstein that perceiving (both the
layout of surfaces and activities with respect to the layout) must be
"objective" and "accurate." If perceiving fell short of these requirements,
if it were, on the contrary, "subjective" and "inaccurate," then meaningful,
adaptive activity would not be possible. Bernstein writes (1967, p. 117): "We
may consider the formulation of the motor problem, and the perception of the
object in the external world with which it is concerned, as having their
necessary prerequisites in maximally full and objective perception both of the
object and of each successive phase and detail of the corresponding movement
which is directed towards the solution of the particular problem." What
Bernstein says seems straightforward enough: perceiving must keep an animal in
contact with its surroundings and with its behavior. It will be argued,
however, that a number of fairly radical steps have to be taken to insure that
the theory of perception that we develop as scientists can live up to the
natural demands placed on perception by normal activity in cluttered
surroundings.

Clearly, Bernstein believes that the role of "afferentation" in the
guidance of activity is the significant role, even though "afferentation" as a
trigger of reflexes has a better scientific pedigree and is better understood
by physiologists. The triggering role of afference assumed prominence because
of the tendency to focus research on artificial movements (discrete responses
made to momentary and punctate stimuli) rather than on activities resolving
environmentally defined problems. Bernstein considered this triggering role
of afference, developed as it was in the context of the reflex arc, to be
overvalued and pointed out two unwelcome consequences of this overvaluation.

First, it established a bias to equate receptor processes in general with
signals that release or inhibit reactions. Bernstein reminds us that this
equation leads to the unacceptable interpretation of the receptor processes
accompanying linguistic events as just triggers, the so-called
second-signaling system. Closer to the present concerns, he points out that
the emphasis on the afferent triggering of reactions obscured the fact that
afference modulates ongoing movements. Bernstein saw a distinction between the
traditional physiology of reaction and the physiology he wished to promote, a
physiology of activity (Gelfand, Gurfinkel, Fomin, & Tsetlin, 1971; Reed,
1982a). In this respect (and others) he was of kindred spirit with Gibson
(1966, 1979, 1982) in rejecting the classical view of action as (merely)
responses triggered by signals emanating from either outside the body or
inside the brain. However, it should be noted that Bernstein labored under a
major terminological manifestation of the classical view, namely, the
correspondences of the terms "sensory" and "afferent," "motor" and
"efferent." This was unfortunate given that he rejected the conceptual
identity that these correspondences implied.

Figure 1. A small sample of the meaningful problems that the surrounding
layout of surfaces poses for a locomoting animal.

As Gibson (1966, 1979, 1982) and Reed (1982a) have ably argued, the
psychological concepts of sensory and motor cannot be equated, respectively,
with the anatomical structures termed afferent and efferent. The anatomical
definition of the sensory system (as receptor elements, cortex, and the
afferent pathways that mediate them) fails to accommodate the adjusting,
optimizing, steering, and symmetricalizing of sense organs, that is, their
purposive activity (Gibson, 1966). Bernstein recognized this inadequacy. In
referring to the systematic searching by sense organs, he wrote (p. 117):
"This is an entirely active process, and the effector side of the organism is
here employed in a manner completely analogous to that which is later
explained to underlie afferentation in the control of movements." The
anatomical definition of motor system (as cortex, motoneurons, and the
efferent pathways that mediate them) fails to accommodate the dynamic
responsiveness of effector organs to changes in the external force field
brought about by changes in the orientation of effectors to the surround and
to the body, that is, their contextual sensitivity (Turvey, Shaw, & Mace,
1978). More than anybody before him, Bernstein sought to substitute the
analysis of action in terms of efferent commands from cortex to motoneurons by
an analysis of action as the selective use of information about the
environment and about one's movements to selectively modulate one's movements
with respect to the environment (cf. Gibson, 1979).

Second, the tradition of regarding afferents as triggering signals
enforced, in Bernstein's view, a general attitude toward afference as
arbitrarily related to the environmental conditions that cause it. All that is
required for the successful initiation of a reflex is afferentation that is
constant and recognizable by the effector apparatus. The proximal cause of a
reaction need bear no necessary relation to the distal cause. As Bernstein
sees it, this idea of an arbitrary connection between afferent states of
affairs and environmental states of affairs is pernicious. If the afferent
(or sensory) codes are arbitrary (as is claimed by Müller's Doctrine of
Specific Nerve Energies and its successors) and if what the animal perceives
is based on these codes, then what is there to guarantee that the animal's
perception is objective and accurate? The depth of Bernstein's concern is
expressed in this quotation (p. 126): "...from the fact that it is clearly
possible to reconcile the perfect operation of reflex functions with the
complete arbitrariness of their sensory codes it is very easy to slide from
the position of the recognition of the symbolic nature of all reception in
general, and of the conditionality of the picture of the world in the brain,
and the psyche, to the concept of unknowability of objective reality and
similar idealistic conceptions..."

2.1 The Cartesian Program

The orthodox and very popular representational/computational approach to
mind (see Chomsky, 1980; Fodor, 1975; Pylyshyn, 1980) is consistent with the
arbitrary coding theme that Bernstein believes (incorrectly, as we will claim
below) to be rooted in the reflex philosophy and methodology. The
representational/computational view abides by a "formality condition," the
explicit understanding that mental operations are formal, symbol manipulations
performed on formal, symbol structures (Fodor, 1980). To a computer (and, by
analogy, to a brain) it is immaterial whether its internal codes refer to this
or that fact; how the signals are formatted and how they relate consistently
among themselves by rule are what matters, not their meaningful content. We
raise the spectre of the formality condition for two reasons. One reason is
that Bernstein, despite his dislike of this condition in the guise that was
familiar to him, invokes a mechanism for the control of activity that is
continuous with the representational/computational thesis and, therefore, with
the formality condition. Bernstein suggests that an ordered sequence of set
points (representations of required values) governs the flow of afference and
efference within the acting animal. The ordered sequence is a program
prescribing the general form of the activity; it is a representation of the
activity for the effector organs (cf. Cummins, 1977; Shaw, Turvey, & Mace,
1982).

The other reason is that the formality condition is clearly tied to the
historical tradition that began with the Cartesian Doctrine of Corporeal
Ideas. It is this tradition that encourages the arbitrary coding
interpretation of afference, not the reflex arc methodology, which is itself a
restatement of the Cartesian doctrine (Reed, 1982b). Descartes' doctrine,
stated very generally, is that all awarenesses are awarenesses of states of
the body or, as we would be more prone to say today, states of the brain. In
contemporary thought, it is said that direct access to environmental and
behavioral states of affairs is limited to the physical (or bodily) outputs of
transducers that are linked not to the environmental and behavioral states,
but to the basic energy variables, e.g., intensity and wavelengths of light
(Boynton, 1975; Fodor & Pylyshyn, 1981). The question that this doctrine poses
has been at the base of almost all theories in psychology, viz.: How can the
environment be known objectively and accurately and acted upon successfully
when the ideas one has about such things are based on awarenesses merely of
brain states? Descartes had an answer to the subsidiary question of how
primary objective qualities might be derived from secondary subjective
qualities. It has been a persistent ingredient in almost all subsequent
theorizing. He assumed an act of understanding that passed judgment on what
environmental things might have caused the brain state; in his best-known
example, he assumed a rule-governed, quasi-mathematical process of inference
from the states of the eye muscles and the visual nerves to the distance of an
environmental object.

We can now focus sharply on the full implications of Bernstein's innocent
claim that the coordination of animal and its environment must be based on
objective and accurate facts. Because of the pervasiveness of the Cartesian
doctrine in physiology, psychology and cognitive science (see Reed, 1982b;
Shaw et al., 1983), it is generally accepted that an animal's awareness of its
activities and of the surface layout to which they refer is not direct but
mediated. Descartes had proposed rules, inferences and judgments to get to
these objective facts of activity and environment from the directly given,
subjective brain states. To Descartes' list of cognitive or epistemic
mediators, later theorists have added representations, schemas, programs,
models, organizing principles, meanings, concepts, and the like. Whether
dressed in its traditional or modern garb, the Cartesian program for
explaining how felicitous activity is achieved in a cluttered environment
faces a profound predicament. There is nothing in this explanation to
guarantee that the proposed inferential operations performed on the brain
states will yield conclusions that are objective and accurate rather than
fatuous. In responding to John Locke's version of the Cartesian program,
Berkeley thought a guarantee was unwarranted and emphasized the phenomenalism
(that there are only phenomenal objects such as ideas) implicit in the
Cartesian program. Hume thought a guarantee was unlikely to be forthcoming and
emphasized the skepticism (that there may well be an environment and
activities oriented to it, but no one can be sure of their existence) implicit
in the Cartesian program. It is to thoughts such as those expressed by
Berkeley and Hume that Bernstein refers when he remarks on "...the concept of
unknowability of objective reality and similar idealistic conceptions..."
Bernstein (p. 125) goes on to say (too cavalierly, in our opinion) that such
thoughts "have been disproved by authentic science long ago." As scientists
committed to an objective reality, we must claim that it is knowable by
animals, more or less. However, a scientific account of perception that is
consistent with this realist posture has been thwarted, in our view, by the
almost universal acceptance of the Cartesian program. As long as the Cartesian
program is the accepted strategy for explaining the coordination of an animal
and its environment, as long as the awareness of surface layout and action is
claimed to be cognitively mediated, then the thoughts of Berkeley and Hume
cannot be dismissed cavalierly and the predicament identified above remains
firmly entrenched in psychological and physiological theory.

Accurate, objective conclusions might be assured if the inferential
operations (and the various cognitive entities such as representations, etc.)
were tightly constrained by reality. But the Cartesian program denies an
animal direct contact with reality; to reiterate, only brain states are
directly contactable. The problem for the Cartesian program, therefore, is how
to get the reality that bears on the felicitous control of activity into the
mind or nervous system of the animal. There are several responses to this
problem. The most popular response is that a model of reality is constructed
by a process of justifying inferences in the course of either evolution or
ontogeny, or both (Bernstein advances a solution of this type). We will
briefly summarize some of the reasons that render this response
(scientifically) unacceptable (see Shaw et al., 1982; Turvey, Shaw, Reed, &
Mace, 1981, for a fuller discussion).

All forms of non-demonstrative inference proposed by inductive logicians
(enumerative inference, eliminative inference, and abductive inference) can be
expressed as a confirmatory relation between evidence and hypothesis. The
conditions of adequacy for confirmation vary among the forms of inference (see
Smokler, 1968), but this is immaterial to the points we wish to make, viz.,
that the very notion of inference requires (1) the ability to project relevant
hypotheses and (2) the availability of predicates in which to frame evidence
statements and hypotheses. To clarify, the notion of a basic set of hypotheses
is explicit in eliminative and abductive inference and implied in enumerative
inference. For example, one version of abduction (Hanson, 1958, p. 72) goes as
follows:

Some surprising phenomenon P is observed.

P would be explicable as a matter of course if H were true.

Hence, there is reason to think that H is true.

If a model of reality were derived from inference, then it would have to be
supposed that appropriate hypotheses (hypotheses that were generalizations
about environmental states of affairs) were already at the disposal of the
animal. What is their origin? Surely the answer cannot be "inference," for
that would precipitate a vicious regress. But if the answer is not
"inference," then the only option for the Cartesian program is that the origin
of the hypotheses is both extra-physical and extra-conceptual. These are
mutually exclusive categories.

The same conclusion follows from the point about the availability of two
kinds of predicates, those for framing evidence statements and those for
framing hypotheses. The predicates in an evidence statement (the outputs of
the transducers) stand for energy variables and, by argument, have their
origin in physical processes. But for any form of inference there must be
available, concurrently, predicates in which to couch hypotheses and these
must be predicates that stand for environmental properties (such as an
obstacle to locomotion). The origin of these environment-referential
predicates cannot be inferential, otherwise the argument is regressive, and it
cannot be physical (law-based) because that option is denied the Cartesian
program, by definition.

The general conclusion to be drawn is that a reliance on inference takes
out a loan of intelligence that science can never repay: The Cartesian
program is not a scientifically tractable program, and a fortiori, is a
program for perception that science would be ill-advised to pursue.

2.2 Gibson's Ecological Program

We believe that the Cartesian program must be abandoned if a
scientifically acceptable account is to be provided of the perceptual
objectivity that Bernstein regards as the sine qua non of action. To ease the
break with tradition, it may help to remember that Descartes built his
perceptual theory around thought, not action. Gibson's (1979) is an approach
to perceiving that takes the control of activity as its central concern. In
this approach the Cartesian doctrine of corporeal ideas is rejected, together
with the many perplexities that it entails. Rather than founding perceptual
theory on brain states that are related tenuously to the environments and
activities of animals, Gibson founds perceptual theory on structured energy
distributions that are lawfully related to the environments and actions of
animals. Rather than asking how accurate objective inferences from brain
states to the facts of environments and actions are made, Gibson asks how
information specific to the facts of environments and actions is detected.
Rather than assuming that the conventional variables of physics provide the
only legitimate basis for describing the environment, Gibson advances the idea
that the environment can be legitimately described in terms that are
referential of the activity capabilities of animals.

It will not be possible for us to do complete justice to Gibson's
perceptual theory in these pages (see Gibson, 1979; Michaels & Carello, 1981;
Reed & Jones, 1982; Turvey et al., 1981; Turvey & Carello, 1981). We will
restrict ourselves, therefore, to those Gibsonian concepts that we take to be
most central to the control of activity, the concepts of information and
affordance. And we will restrict ourselves to the perceptual system of
greatest relevance, the visual perceptual system.

Information is optical structure generated in a lawful way by
environmental structure (for example, surface layout) and by the movements of
the animal, both the movements of its body parts relative to its body and the
movements of its body as a unit relative to the environment. This optical
structure does not resemble the sources that generate it, but is specific to
those sources in the sense that it is nomically (lawfully) dependent on them.
The claim is that there are laws at the ecological scale that relate optical
structure to properties of the environment and action (Gibson, 1979; Turvey
et al., 1981).

This treatment of information and the notion of ecological laws rests on
an optical analysis that departs from the classical geometric ray optics and
the more contemporary physical optics. Though some have argued to the
contrary (e.g., Boynton, 1975; Johansson, 1970), neither of these analyses is
sufficient to capture the richness of light's structure subsequent to multiple
reflections from surfaces of varying inclination and substance and undergoing
various types of change. Gibson's push has been for a theory of optics that
can do justice to ambient light as a basis for the control of activity. Given
that activity is at the ecological scale of animals and their environments,
Gibson termed the sought-after optical theory ecological optics. The
limitations of conventional optical analyses recognized by Gibson (1979) are
echoed by illumination engineers (e.g., Gershun, 1939; Moon, 1961; Moon &
Spencer, 1961) whose goals are much more modest than Gibson's. In the
subsection that follows we consider the activity-relevant questions raised by
Figure 1 in terms of Gibson's ecological optics.

2.2.1 How does the animal know that it is moving forward?

Forward rectilinear motion of a point of observation relative to the
surroundings will lawfully generate an expanding optical flow pattern globally
defined over the entire optic array to the point of observation. [A locally
defined expansion pattern, kinematically discontinuous at its borders with the
optical structure in the large, would be lawfully determined by a part of the
surround moving relative to the point of observation. In natural
circumstances there can be no ambiguity, contrary to the standard claim (von
Holst & Mittelstaedt, 1950), about what is moving, the animal or part of its
environment (Turvey, 1979).] As noted above, the lawfulness of optical
structure at the ecological scale is the basis for its functioning as
information for the control of activity: If A lawfully generates B, then B
specifies A. Lishman and Lee (1973) have shown that humans walking voluntarily
forward will report that they are walking backwards when exposed to the global
optical transformation that is lawfully generated by backward locomotion (and
which, therefore, specifies that the walker is moving backward). Further, when
flying insects are exposed to global optical transformations that are the
lawful consequences of forces that produce rotation, vertical displacement,
and yaw, they respond with the appropriate counteracting forces (Srinivasan,
1977; Turvey & Remez, 1979).

2.2.2 How does the animal know from where to jump (to accommodate an upcoming
barrier) and how does it know whether its deceleration is adequate (to
accommodate its locomotion to an upcoming brink in the ground)?

The answers to both of these questions depend, in the Gibsonian
perspective, on information about the imminence of contact (with barrier or
brink). Lee (1974, 1976, 1980) and others (e.g., Koenderink & van Doorn, 1981)
have identified an optical variable, symbolized as T(t) by Lee (1976), that is
equal to the inverse of the rate of dilation of a bounded region of optical
structure. T(t) is lawfully generated by the approach at constant velocity of
a point of observation to a substantial surface in the frontal plane, or vice
versa; it specifies the time at which the point of observation will make
contact with the surface.
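
The relation between optical dilation and time-to-contact can be illustrated
numerically. What follows is a minimal sketch under stated assumptions, not a
reconstruction of Lee's analysis: the "rate of dilation" is taken to be the
relative rate of change of the visual angle subtended by a flat patch in the
frontal plane, the approach speed is constant, and all names and numerical
values are hypothetical.

```python
import math

def optical_angle(half_size, distance):
    """Visual angle (radians) subtended by a patch of the given half-size."""
    return 2.0 * math.atan(half_size / distance)

def tau_from_optics(theta_prev, theta_now, dt):
    """Inverse of the relative rate of dilation, estimated from two samples.

    Only optical quantities enter: neither distance nor speed is supplied.
    """
    relative_dilation_rate = (theta_now - theta_prev) / dt / theta_now
    return 1.0 / relative_dilation_rate

# Hypothetical approach: a 1 m wide patch, 50 m away, approached at 10 m/s.
half_size = 0.5   # m (assumed)
speed = 10.0      # m/s, constant (assumed)
dt = 0.001        # s between optical samples (assumed)
z_now = 50.0      # m
z_prev = z_now + speed * dt

tau = tau_from_optics(optical_angle(half_size, z_prev),
                      optical_angle(half_size, z_now), dt)
true_time_to_contact = z_now / speed  # distance / speed, for comparison only

print(f"tau from optics: {tau:.3f} s; distance/speed: {true_time_to_contact:.3f} s")
```

The optical estimate agrees closely with distance divided by speed even though
the estimating function is given neither quantity, which is the sense in which
the optical variable specifies time-to-contact.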



Obviously, the existence of an optical variable specifying time-to-contact
bears directly on the question posed by Bernstein (Question 4 above) of how
control can be prospective. Any answer to the question of prospective control
is constrained by the requirements that (1) causes precede effects and (2)
causes be actual rather than possible states of affairs. An event at a later
time cannot cause an action at an earlier time and only actual events can be
causal. In the Cartesian program, the bases of prospective control are
representations; actual mental states existing in the present (rather than
future, possible states of the animal-environment system) are the causes of
activity. The logical format of these representations in the case of
controlled collisions must be that of a counterfactual, roughly of the form
"if I don't change what I am doing and the conditions continue to be as they
are, then X is likely to occur." The basis of prospective control in the
Gibsonian program is exemplified by the time-to-contact variable, viz., there
is information in the present optical structure (e.g., the value of T(t) at
ti) specific to what will occur if the present conditions continue (e.g., a
collision at time ti). To draw the contrast sharply, in the Gibsonian program
the basis for prospective control is sought in laws at the ecological scale
(that relate present optical properties to upcoming properties of the
animal-environment system); in the Cartesian program the basis is sought in
inferential processes (that relate the semantically neutral outputs of
transducers to a counterfactual representation). Reiterating the arguments
raised above, the Cartesian solution to the problem of prospective control
begs the interesting questions; for example, how does the animal construct
just that counterfactual representation that is right for the current
situation?

Let us look at an example of the use of the time-to-contact variable.
The gannet, a large seabird that feeds on fish, hovers about thirty meters
above the water. On sighting a prey, it dives down first with its wings
partly spread for steering and then with its wings folded so that it enters
the water vigorously but cleanly. It may hit the water at speed approaching 25
m s-1 (or 55 miles h-1). The action problem for the gannet is to retract its
wings soon enough to avoid fracturing them but not so soon as to hinder the
accuracy of its dive. Given that the gannet dives from varying heights, at
varying initial speeds, and in varying wind conditions, how does it properly
control its entry? Lee (1980) and Lee and Reddish (1981) have concluded that
wing retraction is initiated when the time-to-contact variable reaches a
certain margin value. (Because the animal is constantly accelerating due to
gravity in the dive, the same margin value of T(t) will be associated with
different actual times-to-contact. The birds are seen to fold their wings a
longer time before contact the higher the starting point of the dive.)
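
The parenthetical point can be checked with a short calculation. This is a
hedged sketch, not the model fitted by Lee and Reddish: it assumes a dive from
rest under gravity alone (no drag, no initial speed), takes T(t) to be
remaining height divided by current speed, and uses a hypothetical margin
value of 0.8 s.

```python
import math

G = 9.81  # m/s^2, gravitational acceleration

def actual_time_to_contact_at_margin(height, tau_margin):
    """Real time left before water entry at the moment T(t) equals the margin.

    Assumes free fall from rest. With speed g*t and remaining height
    h - g*t^2/2, setting (h - g*t^2/2) / (g*t) = margin gives a quadratic
    in t whose positive root is the moment of wing folding.
    """
    t_fold = -tau_margin + math.sqrt(tau_margin ** 2 + 2.0 * height / G)
    t_entry = math.sqrt(2.0 * height / G)
    return t_entry - t_fold

tau_margin = 0.8                    # s, hypothetical margin value
heights = (5.0, 10.0, 20.0, 30.0)   # m, hypothetical starting heights
times = [actual_time_to_contact_at_margin(h, tau_margin) for h in heights]

for h, t in zip(heights, times):
    print(f"dive from {h:4.1f} m: wings fold {t:.2f} s before contact")
```

Under these assumptions the actual time remaining at the margin is always
shorter than the margin itself and grows with starting height, which is
consistent with the observation that wings are folded a longer time before
contact in higher dives.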

There is reason to believe that the time-to-contact variable is the basis
of prospective control in a number of related circumstances. Data on the
kinematics of catching a ball (Sharp & Whiting, 1974, 1975), hitting a
baseball (Hubbard & Seng, 1954), infants reaching for a moving object (von
Hofsten & Lindhagen, 1979; von Hofsten, 1983), stepping down (Freedman,
Wannstedt, & Herman, 1976), and falling on one's hands against an inclined
board (Dietz & Noth, 1978) are more or less amenable to such an analysis (see
Fitch, Tuller, & Turvey, 1982; Fitch & Turvey, 1978; Lee, 1980). The last
situation is depicted in Figure 2. The triceps brachii muscles are shown to
tense in preparation for an upcoming collision in which the arms must absorb
the momentum. With the eyes closed, the electromyographic index of the
initiation of muscle tension is tied to the start of falling; with the eyes
open and with different falling distances the index occurs at varying times
after the start of falling but at an approximately constant time prior to
contact.

We should remark that the fact of a simple, single o|jtical property 
specifying the imminence of contact has implications for another of
Bernstein's concerns, namely, how an animal can adjust its behavior to the
velocity of things. Bernstein pursues a conventional argument that velocity
is arrived at by a process of comparing the present location of a thing with
the memory trace of an immediately preceding location and dividing the
deduced distance traveled by an internally determined estimate of elapsed
time. The inadequacies of this kind of explanation have been discussed in
detail (Gibson, 1979; Turvey, 1977). Here we wish to comment only on the
questionable strategy of analyzing higher-order activity-relevant variables
in terms of the putatively more basic variables of displacement and time.
Inertial guidance systems are based on Newton's laws of inertia and gravity.
These systems detect accelerative forces directly. They determine velocity
and distance indirectly through the single and double integration,
respectively, of the accelerative forces. In like fashion, adherents to the
Gibsonian program (Lee, 1980; Runeson, 1977; Turvey & Shaw, 1979) argue that
the imminence of collision is not inferred from a preliminary determination
of speed of approach and distance from surface; rather, the basis for an
animal's knowing when a surface will be contacted is the detection of τ(t)
as such. The point is that to understand how perception controls activity,
we must be willing (i) to question the primary reality status of the basic
variables of physics; (ii) to look for variables (observables, quantities)
at the ecological scale that uniquely specify the relation of animal to
environment; and (iii) to consider hard- or soft-molded processes that
detect these ecological variables (rather than knowledge-based procedures
that construct representations of them from conventional physical
variables).

Figure 2. With the eyes open and with different falling distances, the
initiation of tension in the triceps brachii muscles occurs at varying
times after the start of falling but at an approximately constant
time prior to contact (Above). With the eyes closed, initiation of
muscle tension is tied to the start of falling (Below). (From
Fitch, H. L., Tuller, B., & Turvey, M. T. (1982). The Bernstein
perspective: III. Tuning of coordinative structures with special
reference to perception. In J. A. S. Kelso (Ed.), Human motor
behavior (p. 278). Hillsdale, NJ: Erlbaum.)
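
The contrast between the inertial-guidance strategy and the direct detection
of time-to-contact can be sketched in symbols. The notation here is an
illustrative assumption and is not taken from the text: a is a measured
acceleration, v and x are the velocity and distance derived from it, and θ
is the visual angle subtended by an object of size S at distance Z that is
closing at a constant speed v (small-angle approximation).

```latex
% Inertial guidance: velocity and distance obtained indirectly by integration
v(t) = v_0 + \int_0^t a(s)\,ds, \qquad
x(t) = x_0 + \int_0^t v(s)\,ds

% Optical alternative: \theta \approx S/Z and \dot{Z} = -v give
% \dot{\theta} = S v / Z^2, so the ratio below needs neither S, Z, nor v
\tau(t) = \frac{\theta(t)}{\dot{\theta}(t)} \approx \frac{Z(t)}{v}
```

On this sketch, the ratio of the optical angle to its rate of change yields
the time remaining before contact without any prior estimate of distance or
speed.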

So, how does the animal know from where to leap? The answer, to be
blunt, is that it does not need to know the proper place; rather, it needs to
know the proper time. The former depends explicitly on the speed, the latter
does not. Evidently, as anticipated, the successful leaping of a barrier
depends on the time-to-contact variable. It also depends on body-scaled
information, but we will have more to say about that below. And how does the
animal know whether it is braking sufficiently? An animal's deceleration is
adequate if and only if the distance it will take the animal to stop is less
than or equal to its current distance from the brink (Lee, 1980). Adequacy of
braking is specified by whether the rate of change of τ(t) equals or exceeds a
critical value (Lee, 1976, 1980). A related observation is that flies begin
to decelerate prior to contact with a surface at a critical value of τ(t)^-1
(Wagner, 1982).
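
The braking criterion can be made concrete with a minimal numerical sketch.
It assumes constant deceleration and takes the time-to-contact variable to be
distance divided by closing speed; under those assumptions its rate of change
is -1 + (distance x deceleration) / speed^2, and the stopping-distance
condition is equivalent to a rate of change of at least -0.5. The function
names and sample values are hypothetical, chosen only for illustration.

```python
def tau_dot(distance, speed, deceleration):
    """Rate of change of time-to-contact (distance / speed),
    assuming constant deceleration (an illustrative assumption)."""
    return -1.0 + distance * deceleration / speed ** 2


def braking_is_adequate(distance, speed, deceleration):
    """True if the stopping distance speed**2 / (2 * deceleration)
    does not exceed the current distance to the brink."""
    return speed ** 2 / (2.0 * deceleration) <= distance


# Hypothetical values: 10 m from the brink, closing at 5 m/s.
for decel in (1.0, 1.25, 2.0):
    print(decel, tau_dot(10.0, 5.0, decel), braking_is_adequate(10.0, 5.0, decel))
```

In this sketch the two functions agree: braking is adequate exactly when the
computed rate of change is at or above -0.5.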

2.2.3 How does the animal know that the barrier is jumpable and that the
brink is a step-down place (rather than a falling-off place)?

Knowing that something is in the class of jumpable objects and that some
other thing is in the class of step-down places would be treated in the
Cartesian program as the imposition of subjective, meaningful categories on an
objective, meaningless surround. Conventionally, it would be said that the
animal has concepts of such things and debate would focus on how such mental
entities could be established. Careful analysis would reveal that, given the
departure point of the Cartesian program, empirical contributions to such
concepts would have to be secondary to the rational contribution (Fodor,
1975). In sharp contrast, the Gibsonian program seeks to uncover a natural,
lawful basis for knowing what activity (or activities) a situation offers.
Consider a brink in the surface that happens to be a step-down place for a
given animal rather than a place where it would have to jump down or climb
down or steer away from. To begin with, the property of the brink as a
step-down place for the animal cannot be captured in the scales and standard
units of physics. These scales and units are intended to be "fully
objective," that is, observer- or user-independent. They are extrinsic
measures, in that the standards on which they are based are divorced from and
external to the situations to which they are applied. To capture a step-down
place for a given animal requires intrinsic measures, those whose standards
are to be found in the situation of animal and brink. In Figure 3 the
separation of surfaces (R) must somehow be expressed in units of the animal.
Leg length is obviously significant but scaling surface magnitudes in terms of
the unit 'eye-height' is probably a better move (cf. Lee, 1974, 1980). A lawful
allometric relation (Gunther, 1975; Huxley, 1932; Rosen, 1967) is to be
expected between eye height (E) and leg length (L): L = aE^b, where a and b
are constants. (Eye height will, of course, vary with the animal's posture,
but our intent here is to convey the style of the analysis rather than its
full detail.) If the separation of surfaces (R) at a brink is below some
critical number, nE (or is less than or within a tolerance range nE ± δ), then
the separation is a step-down place; above this critical number (or range) it
is a place that requires some locomotory strategy other than stepping down.
Noting that E is unity, there is a dimensionless quantity that marks the
boundary between the activities (stepping down vs. jumping down, for
example) that a brink offers an animal. Now the question becomes whether or not
there is an optical property specific to this dimensionless quantity.

First, a point of observation moving toward a brink in a surface (where
one surface partially occludes another) will lawfully generate an optical flow
pattern in which there is a discontinuity, viz., a horizontal contour above
which optical structure magnifies and gains and below which optical structure
magnifies but does not gain. The non-gain and gain of structure are specific,
respectively, to the occluding surface currently supporting the animal and the
occluded surface to which it is heading. Second, from Figure 4 (after Warren,
1982) it can be seen that the separation (R) of the occluding and occluded
surfaces can be expressed in units of the height of the point of observation E
and in terms of the ratio of the rate of displacement of the point of
observation (dx/dt) to the rate of gain of structure (ds/dt). Letting t2 - t1
be the time of a step (x), for the same stepping rate the gain of structure
(s) is greater the greater the separation of surfaces (R). Although the
example is crudely developed, it makes an in-principle argument that there
will be a dimensionless quantity of optical flow, such as (dx/dt)/(ds/dt),
which is specific to the vertical separation of surfaces at a brink, scaled in
units of the observer. A critical value or range of this optical flow property
will specify the boundary between places the animal can step down from (those
that can be accommodated by limb extension) and places requiring a different
maneuver.

Figure 3. Approaching a brink of a surface. E is eye height, L is leg length,
and R is the surface separation.

Figure 4. In approaching a brink of a surface the ratio (dx/dt)/(ds/dt)
depends on the surface separation R relative to the eye height E.

Dimensionless numbers play a significant role in many branches of
physics. The commonly used numbers, referred to as principal numbers by
Schuring (1980) (of which the Reynolds, Rayleigh, Mach, Prandtl, and Froude
are prime examples), are built from laws. Thus, the Reynolds number, which
applies to fluids, is built from Newton's law of inertia and the law for shear
stress of a Newtonian fluid. The two laws are cast as dimensionless ratios or
π numbers (e.g., π = F/ma), and these two numbers in ratio give the
Reynolds number. At a critical value of the Reynolds number, the inertial
forces (favoring turbulence) dominate the viscous forces (favoring laminar
flow) and there is a shift from the one kind of flow to the other. Generally
speaking, the major dimensionless numbers in physics mark off, at critical
values, a change in the relation of forces from a balance between them to a
dominance by one of them and, thereby, mark off distinct physical states. In
like fashion, it seems that the dimensionless numbers built from purely
optical variables mark off, at critical values, distinct states. They are not
physical states associated with distinct forms of energy absorption, however,
but specificational states (see Kugler, Turvey, Carello, & Shaw, in press).
Thus, in the example just given, the dimensionless quantity (dx/dt)/(ds/dt)
specifies step-downable when it is below a critical value and
non-step-downable when it is above that critical value.
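
For reference, the standard form of the Reynolds number can be written as a
ratio of inertial to viscous effects. The symbols are the conventional ones
from fluid mechanics and are not taken from the text: ρ is fluid density, v
is flow speed, ℓ is a characteristic length, and μ is dynamic viscosity.

```latex
\mathrm{Re} \;=\; \frac{\text{inertial forces}}{\text{viscous forces}}
\;\sim\; \frac{\rho v^{2} \ell^{2}}{\mu v \ell}
\;=\; \frac{\rho v \ell}{\mu}
```

The units cancel in the final expression, which is what allows a single
critical value to mark the shift from one kind of flow to the other across
fluids and scales.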

We wish to underscore with two undeveloped examples the potential
significance of dimensionless quantities to law-based explanations of the
control of activity. When dτ(t)/dt > -0.5, it specifies that the point of
observation will stop prior to contact with an upcoming substantial surface if
current conditions persist, whereas when dτ(t)/dt < -0.5, it specifies that
there will be a collision between the point of observation and the surface if
the current conditions persist. This critical value of the rate of change of
the time-to-contact variable is an invariant optical quantity: Whether the
animal is approaching a surface or being approached by a surface, the quantity
-0.5 marks off two distinct specificational states concerning the collisional
consequences of the animal's current activity.

The second example returns the focus of this subsection to the perception
of the kind of activity that an arrangement of surfaces affords an animal.
Warren (1982) investigated the perception of stairways that varied in riser
height in terms of two questions: (1) Could a person perceive whether a
stairway was climbable in the normal fashion (a question of the critical riser
height)? and (2) Could a person perceive how costly, in metabolic terms, a
stairway would be to climb (a question of the optimal riser height)? A
preliminary analysis of the biomechanics of stepping up revealed that the
riser height (B) beyond which normal stair climbing would be impossible was a
constant proportion of leg length (L), viz., .88L, or B/L (a dimensionless
quantity) = .88. Subjects, who differed markedly in height (5'4" vs. 6'4"),
saw photographs of stairways with risers that ranged between 20 in and 40 in
and were asked to judge the climbableness of each stairway. Although the
riser height that distinguished the stairways judged to be climbable from the
stairways judged to be nonclimbable differed between the two groups of
subjects when measured in inches, it did not differ when measured in leg
length. For both groups of subjects B/L = .88, that is, the critical riser
value that had been determined from biomechanical considerations. With respect
to the optimal riser height, the metabolic cost of climbing at 50 steps/min on
an adjustable, motor-driven stairmill was evaluated at riser heights varying
from 5 to 10 in for short (5'4") and tall (6'4") subjects. The minimum energy
expenditure per vertical meter (cal/kg-m), indicating optimal riser height,
occurred at a riser height of B = .26L. In two visual tasks, a forced choice
task and a rating task, the stairways were pitted against each other in pairs.
The tasks revealed that the preferred riser height (the stairway that was seen
to be the one that could be climbed most comfortably) differed between the two
groups of subjects when measured in inches but it did not differ when measured
in leg length. The preferred or optimal value for both groups was .25L in the
forced choice task and .24L in the rating task, very close to the optimal
value of .26L determined by metabolic measurement.
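
The body-scaling argument can be illustrated with a small sketch. The ratios
.88 and .26 are the values reported above; the leg lengths are hypothetical
placeholders, since the subjects' leg lengths are not given here, and the
function names are ours.

```python
CRITICAL_RATIO = 0.88  # B/L beyond which normal climbing is impossible (from the text)
OPTIMAL_RATIO = 0.26   # B/L at minimum metabolic cost (from the text)


def riser_heights(leg_length_in):
    """Return (critical, optimal) riser heights in inches for a leg length."""
    return CRITICAL_RATIO * leg_length_in, OPTIMAL_RATIO * leg_length_in


def is_climbable(riser_in, leg_length_in):
    """A riser is climbable in the normal fashion if B/L does not exceed .88."""
    return riser_in / leg_length_in <= CRITICAL_RATIO


# Hypothetical leg lengths (inches) for a shorter and a taller observer.
for leg in (30.0, 36.0):
    critical, optimal = riser_heights(leg)
    print(leg, critical, optimal, critical / leg, optimal / leg)
```

The critical and optimal risers differ between the two hypothetical observers
when expressed in inches, but the ratios to leg length are identical, which is
the pattern the judgments are reported to follow.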



2.2.4 Affordances

In Gibson's (1979) terminology, step-down places, falling-off places,
climbable places, collide-withable surfaces, travel-throughable openings and
so on (Figure 1) are affordances. That is to say, they are properties of the
environment taken with reference to the animal. An affordance is an invariant
arrangement of surface/substance properties that permits a given animal a
particular activity. It is a real property (one might even say a physical
property) but one that is defined at the ecological scale of animals and their
niches. By the laws of ecological optics, the light structured by an
affordance will be specific to the affordance, as the above examples suggest.
The optical property specific to an affordance is like the time-to-contact
property: It is not decomposable into optical variables of a putatively more
basic type. Consequently, it is claimed, the perceiving of an affordance is
based on detecting the optical property that specifies the affordance. In the
Gibsonian program, perceiving an affordance is not mediated by
computational/representational processes. It is said to be direct, and
understanding how this can be (understanding the physical processes at the
ecological scale that make possible the direct perception of the reality that
bears on the control of activity) is what the Gibsonian program is
fundamentally about (Section



3.0 Principles of Self-Regulation

It is fair to say that in working under the Cartesian program one is
inclined to explain regularity (of activity) by reference to intelligent
regulators. In the Cartesian view of things, it is an act of the intellect
that interprets the outputs of sensory transducers and puts them to use with
respect to externally oriented desires. Intelligence in its various
manifestations (e.g., judging, comprehending, decision making, comparing,
projecting and evaluating hypotheses, recognizing, reconsidering, commanding,
and so on) is at the core of the Cartesian explanation of the control and
coordination of movement. For Descartes himself the intellect was equated with
the soul, or as Ryle (1949) liked to say, disdainfully, "the ghost in the
machine."

The contemporary student of movement who chides all 'little man in the
brain' explanations of control may, however, be firm in the belief that
concepts borrowed from cybernetics and formal machine theory are acceptable
explanatory tools. Personally, we think such convictions are suspect.
Concepts such as set-points, programs, and so on are superficially attractive
in that they refer to material things that perform the role historically
ascribed to homunculi. Under closer scrutiny, such concepts are revealed to be
the products of an intelligent act performed by a being with foreknowledge of
the regularity to be achieved. The concepts of cybernetics and formal machine
theory are seductive because they facilitate the simulation of 'regularities,'
but they are not, we believe, in the best interests of explanatory science.
First, these concepts necessarily assume intelligence and rationality,
assumptions that were, after all, the reason for science's original and
persistent displeasure with Descartes' homunculus. Second, their promise is
limited, at best, to describing and, perhaps, to predicting regularities. But
explaining, in the sense of identifying the lawful basis for behavior, is
ineffably beyond their reach.

At one time, Bernstein was enthusiastic about the relevance of
cybernetical and formal machine analogues to the physiology of activity. He
later became much more circumspect with regard to their appropriateness.
Cybernetical notions figure prominently in his discussion of "Some Emergent
Problems of the Regulation of Motor Acts" (as we will underscore in the
subsections that follow). But in later chapters he questions the propriety of
cybernetics for biology and physiology and intimates that "the 'honeymoon'
between these two sciences" (p. 181) may be over (also pp. 185-186). In
Section 3.1 we critically evaluate the cybernetical treatment of Bernstein's
regulatory notion of circular causality and in Section 3.2 we outline the
physical conditions for that principle. Our belief, consonant with Bernstein's
later impressions, is that the physiology of activity would fare better
married to a physics that addresses the ecological scale and its natural
regularities rather than to a formal theory of the regulation of artifacts.

3.1 The Ring Principle (Cybernetically Interpreted)

Bernstein is convinced, and properly so, that self-regulation is based on
circular causality, the "ring principle" as he terms it. He embraces the
familiar interpretation of this principle, the one advanced by cybernetics: A
referent signal or set point mediates signals fed forward to and fed back from
a device or process (generically referred to as 'the plant'). For the conduct
of an activity a single referent (and, a fortiori, a single ring) will rarely
be sufficient. Bernstein assumes an ordered sequence of referent signals.
Insofar as a referent signal must predate the afferent and efferent flow that
it mediates, so the order of the referent signals must largely be ascribed
prefatory to the activity. In brief, Bernstein's proposal for self-regulation
is the popular notion of a program. MacKay (1980) identifies the kinds of
detail one might expect to find in a program for a step cycle of locomotion
(Figure 5).

In addition to identifying (1) the general a priori prescriptive nature
of the program, this example illustrates nicely that a program is (2) an
orderly sequence of preferred quantities (e.g., 100 ms), (3) an orderly
sequence of commands (e.g., stop extension) to the skeletomuscular machinery
that realizes these quantities, and (4) an orderly sequence of symbol strings
(the representational format for the quantities and the commands). It also
illustrates a more profound feature of the program conception: (5) that
rate-dependent processes (the irreversible thermodynamics and the mechanics of
the skeletomuscular system) are coupled to and constrained by rate-independent
structures (the symbol strings).

The centrality of the ring principle to self-regulation cannot be
doubted. (The reciprocity of locomotion and global optical transformations
described in Section 2.0 is one example of the principle's ubiquitous
application.) What can be doubted is whether the properties identified in (1)
through (5) above are necessarily entailed by the principle.

3.1.1 The concept of the referent signal

The sollwerts (required values, set points) that have been used
frequently to 'explain' the stabilities of vegetative processes
(thermoregulation, respiration, feeding, drinking, etc.) are more fictitious
than real (e.g., Friedman & Stricker, 1976; Iberall, Weinberg, & Schindler,
1971; Mitchell, Snellen, & Atkins, 1970; Werner, 1977). The observed stable
quantities of vegetative processes (e.g., human body temperature of 37 degrees
centigrade) are not prescribed values or goals playing a causal role. They
are, more accurately, resultant quantities, indexing a stable relation between
independent processes (force systems) defined over the same state variables
(Iberall, 1978; Kugler, Kelso, & Turvey, 1980, 1982; Yates, 1982b). As we like
to put it, these so-called sollwerts are not a priori prescriptions for the
system but a posteriori facts of the system's processes.

Figure 5. A program formulation of a locomotory step cycle (Adapted from
MacKay, W. A. (1980). The motor program: Back to the computer.
Trends in Neurosciences, 3, 97-100).

The experiments of Zavelishin and Tannenbaum (1968) are illuminating in
this regard. They focussed on two respiratory variables: the resistance (r)
of the air to inspiration and the duration (d) of inspiration. The function f
relating d to r was identified. A function F relating r to d was imposed
(Figure 6). Circular causality was thereby established. The value of d at
which the respiratory system settled down was that value mutual to the two
functions. If F was chosen to intersect f at more than one value of d, then
the system would settle at one of the mutual points or oscillate between them
depending on the actual details. Figures similar to Figure 6 are to be seen
in Mitchell et al. (1970) and Werner (1977) with reference to temperature
regulation, and in Guyton (1981) and Yates (1982a) with reference to the
pressure-flow relationships for blood circulation.

Figure 6. Circular causality defined over the respiratory variables r
(resistance to inspiration) and d (duration of inspiration).



Each of the aforementioned instances of the ring principle or circular
causality involves two distinct pathways of influence between two variables, x
and y. The system in question must satisfy two independent causal laws, one
linking x to y and one linking y to x. The real equivalent of the point of
intersection of the two functions in the x by y coordinate space is the
equilibrium operating point of the system, the only point that satisfies both
causal laws. The equilibrium point is not frozen; it can be shifted by
changing the system parameters (see Guyton, 1981; Mitchell et al., 1970;
Werner, 1977; Zavelishin & Tannenbaum, 1968). In sum, each of these
instances of circular causality exemplifies a stable equilibrium state that is
achieved without the processes of measuring the istwert of the bounded
variable, comparing it to the sollwert, and amplifying the difference to bring
about an action that reduces this difference. The processes of measurement,
feedback, amplification, and comparison that Bernstein takes to be the minimal
requirements of self-regulation are not to be found.
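
A minimal sketch of this idea: two independent relations, one giving d from r
and one giving r from d, jointly fix an operating point without any stored
set point, comparator, or error signal. The linear forms and coefficients
below are hypothetical and are not the functions measured by Zavelishin and
Tannenbaum; they are chosen so that simply alternating the two relations
converges.

```python
def f(r):
    """Hypothetical physiological relation: duration d as a function of resistance r."""
    return 1.0 + 0.5 * r


def F(d):
    """Hypothetical imposed relation: resistance r as a function of duration d."""
    return 4.0 - d


def settle(d=0.0, steps=60):
    """Alternate the two relations around the ring and return the final (d, r)."""
    for _ in range(steps):
        r = F(d)
        d = f(r)
    return d, F(d)


d_eq, r_eq = settle()
print(d_eq, r_eq)  # approaches the intersection of the two relations
```

Nothing in the loop encodes the value at which the system settles; that value
is a consequence of the two relations, and changing either relation's
coefficients would shift it.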

3.1.2 Intensional descriptions and teleological explanation 

How general is an interpretation of self-regulation that does not 
implicate the conventional, intelligence-based, mediating mechanisms of
cybernetics? Intuitively, a notion such as a referent signal or program, and
the related processes of feedback and the like seem to be called for whenever
we construct a description of a system (S) such as:

'S prefers (wants, desires, seeks, etc.) G,'

where G can be the value of a property of S, a property of a thing in S's
environment, an orientation of S to the layout of the environment, and so on.
A statement of the above kind is called an intensional context or description.
Basically, it involves borrowing the property of one thing, G, to build a
property of another thing, S, viz. 'prefers G.' What is the status of the
borrowed property G?

Orthodoxy invariably interprets intensional description as license to 
ascribe concepts: To predicate of S the property 'prefers (wants, desires,
seeks, etc.) G' is to ascribe to S the concept of G. Similarly, in matters of
perception, to say that 'S can perceive step-down places' (Section 2.2.3) is
to say, by the orthodox interpretation, that S has a concept of step-down
places. What is it about intensional description that invites conceptual
ascription? Why should the convenience of describing a property of a thing S
in terms of a thing G be translated into the claim that S possesses or
embodies in some form the thing G? Empirical considerations reveal that the
intensional context 'biological system S prefers a body temperature of 37
degrees centigrade' does not mean that the end-state of 37 degrees is encoded
in the system's central nervous system and, relatedly, does not identify a
relation between S and a central nervous system representation of the quantity
37 degrees centigrade. The lesson of this example is twofold: First,
intensional description does not mandate conceptual ascription; and second,
intensional description may simply be a way of referring indirectly to lawful
processes. It seems, therefore, that intensional description will invite
conceptual ascription to the degree that a lawful basis for a given regularity
is unexplored or indiscernible (Turvey et al., 1981).

Let us turn away from vegetative processes, such as temperature
regulation, to the more general case. Consider the following example of a
goal-directed activity, to be designated L. A swiftly flying bird suddenly
changes its posture, spreads its wings, flaps them briefly, glides, flaps its
wings a little more, and alights gently on a branch. A teleological
description (Woodfield, 1976) of L reads:

'S did B in order to do G',

where S refers to the bird, B to the behavior and G to the goal of alighting
on the branch with a minimal mutual transfer of momentum between bird and
branch (Kugler, Turvey, Carello, & Shaw, in press). This teleological
description of L can be expanded (after Woodfield, 1976) to make the implied
internal conditions transparent, albeit in "mentalese" (Fodor, 1975):

'S did B because S (i) wanted to do G and (ii) believed
that B would lead to G.'

Now we have a teleological explanation of L.

It is very important to distinguish a ring principle (or circular
causality) explanation of L from a teleological explanation of L. The ring
principle takes G for granted and explains how S gets to G. By definition, a
ring principle explanation consists not so much of a single sentence of the
type 'S did B (at a particular time, in a particular way, etc.) because...'
but consists, rather, of a set of sentences describing cycles of
acting-perceiving-change (in the animal-environment relation). The ring
principle explanation of L would be in terms of the reciprocity of the bird's
approach and optical flow, with particular emphasis on the decelerative forces
supplied by the bird with respect to maintaining the optical flow property
dτ(t)/dt within the range specifying a 'soft' contact (see Section 2.2.2).
The teleological explanation of L also takes G for granted, but it explains
why S does B. Thus, the two explanations are complementary (Woodfield, 1976).
If the processes governed by the ring principle are viewed as dynamical, then
the states of S ('wanted', 'believed') are tantamount to field boundary
conditions on dynamical processes. The form of these particular boundary
conditions is that of nonholonomic constraints (about which more will be said
below).

Clearly, the intention (i) in the teleological explanation of L above is
Bernstein's 'image of achievement' (Pribram, 1971) that constrains the
variations in the content of the belief (ii) until G is done. Recall that, for
Bernstein, where actions are planned they are planned in terms of biological
consequences (that is, in terms of how an activity will change the
animal-environment relation) and not in terms of the pattern of bodily movement
(Problem 3 of the introduction). But can the state picked out by the phrase
'wanted to do G' be interpreted as an 'image' in the sense of an actually
existing mental or neural representation of G? Woodfield (1976) cautions
thusly:

It is tempting to think of a goal as a concrete future event, and to
think of the present desire as involving a conception of that future
event, with the conception of the goal being in some sense logically
or ontologically derivative from the goal itself. But this is the
wrong way round. A goal just is the intentional object of the
relevant kind of conception. (p. 205)

Let us see what a Gibsonian analysis of G looks like. The goal G in the
goal-directed activity L involves two aspects: one, a surface X that can
support S and two, a soft, feet-first collision with X. The former aspect
defines an affordance, and under the Gibsonian program an affordance is
optically specified. That is, the light structured by a branch is specific to
the support property of that surface layout vis-a-vis the bird's proportions.
The latter aspect, that of the soft collision, is specified by
dτ(t)/dt ≥ -.5. The two aspects of the intention 'wanted to do G' in L might
be interpreted in the Gibsonian program as follows: 'wanted to do G' is a
matter of having detected information that continuously specified a surface of
support and having detected information that continuously specified the
intensity of an upcoming collision with that surface, on the occasion of a
certain metabolic condition of S.

One should be circumspect about the generalizability of an analysis of
the preceding type. Intentionality is a large issue and the reader's favorite
example of intentional behavior is probably much more elaborate than L.
However, states of affairs such as L are common; they comprise the larger part
of an animal's daily directed activities. And insofar as the Gibsonian program
can anchor teleological explanations of goal-directed activities such as
L in natural laws at the ecological scale, it promises a natural basis for
intentionality. Be that as it may, the comparison between the Cartesian and
Gibsonian programs on the subject of intentional objects (the goals of
goal-directed activities) is sharply drawn. Under the Cartesian program,
intentional objects are represented in an internal medium; under the
Gibsonian program, intentional objects are lawfully specified by structured
energy distributions (Turvey et al., 1981).

3.1.3 Commands as information in the indicational sense

Any disquiet with the concepts of internally encoded required values or
intentional objects as representations extends to the concept of 'commands.'
Is circular causality in general, and the perception-action ring in
particular, mediated by commands? Although it has been a commonplace to say
that the brain commands the body, this way of talking has been subject to
little scrutiny. As Reed (1981) has observed, there is an entire theory of
action wrapped up in the notion of central nervous system commands and much
conceptual effort will be required to unravel it. We will give some hints of
what is involved. To anticipate, issues raised in the preceding subsections
will make a repeat appearance but in a subtly different form.

The control of activity is founded on information, as both Bernstein and
Gibson have sought to understand. "Commands" are a kind of information that
can be termed indicational because their role is to indicate an action to be
performed (Reed, 1981), much as a stop sign on the highway indicates the
action of arresting the forward motion of a car and a directional sign on the
highway indicates which turn to take. Indicational information is incomplete.
To be commanded to stop one's car is not to be told the details of how to do
so. Obviously, the informational basis for controlling activity is not
exhausted by information in the indicational sense. To stop the car requires
information about when to begin decelerating and information about when the
deceleration is sufficient and so forth. This sense of information was
discussed in Section 2.2 and in the immediately preceding section. Consonant
with the terminology of these earlier sections, we will refer to this sense of
information as specificational. The important point to be made is that an
indicated act cannot be performed without information in the specificational
sense. On generalizing, this point reads: The indicational sense of
information is always predicated on the specificational sense of information.

Holding this dependency in abeyance for the present, let us focus on the
commonalities between commands (as sources of information in the indicational
sense) and rules. Neither commands nor rules can determine an action, both
commands and rules can be violated or ignored, both commands and rules can
enter into conflict (creating demands for impossible outcomes), and both
commands and rules require an explicit act of comprehension for their
functioning (Reed, 1981). For these reasons, a lawful determinate account of
the control and coordination of activity cannot be founded on the notion of
commands or information in the indicational sense. A further undesirable
feature is that the criticisms that apply to a body-states or sensation-based
theory of perception (see Section 2.0) apply to a command-based theory of
action: There is no rational explanation of the genesis of the knowledge that
forms and interprets commands. A command-based theory of action looks like
another unrepayable loan of intelligence.

The lawful basis of optical structure relevant to activity's control was
labored in Section 2.2 in order to make the notion of specification
transparent. Where information in the indicational sense is close to the
concept of rule, information in the specificational sense is cognate with law.
Laws are determinate, non-negotiable (they can never be violated or ignored),
harmonious (they can never give rise to impossibilities), and they do not
depend on explicit knowledge for their functioning. In the cybernetical
interpretation, the ring principle is mediated by indicators (commands). But
it is apparent that this need not be so, for the same reason that mediation by
referent signals need not be so. It is an unmediated, law-based interpretation
of the ring principle (rather than a mediated, rule-based interpretation) that
is the focus of the Gibsonian program (see Section 3.2). A lawful account of
the control and coordination of activity cannot be founded on information in
the indicational sense but it could be founded on information in the
specificational sense.

3.1.4 Symbolic and dynamical modes



The contrast of indicational information and specificational information
parallels that of discrete symbol strings and continuous dynamical processes
or, equivalently, rate-independent structures and rate-dependent processes.
These contrasts are said by Pattee (1973, 1977, 1979) to identify a
Complementarity Principle that is the hallmark of living systems. Living
systems are seen to execute in two modes, the symbolic and the dynamic, which
are incompatible and irreducible. Consequently, understanding biological,
physiological and psychological phenomena is said by Pattee to rest with the
elaboration of this complementarity. The computational/representational
approach to these phenomena that is championed by the Cartesian program is
flawed, in Pattee's view, because it attempts to explain only through the
discrete symbolic mode. Similarly, in his view, an approach that seeks to
explain such phenomena using only the laws of dynamics will also prove
inadequate. By Pattee's reasoning, both modes must be given full recognition;
the phenomena in question are the result of the coordination between the two
modes. Stated more sharply, complementarity is advanced as a principle that
calls for simultaneous use of formally incompatible descriptive modes in the
explanation of the characteristic phenomena of living systems (Pattee, 1982).

There is, however, an asymmetry between the two modes that has to be
appreciated. Nature uses the symbolic mode, that is, nonholonomic
(nonintegrable) constraints, sparingly. Dynamics are used to the fullest,
wherever and whenever, to achieve characteristic biological effects. Symbol
strings are used, now and then, to direct dynamical processes and to keep down
their complexity; in other words, to trim the dynamical degrees of freedom. In
Figure 5, which depicts a prototypical program formulation of activity, the
opposite strategy is at work. Very many nonholonomic constraints are exploited
to achieve ('to explain') the kinetic and kinematic regularities of a
locomotory step cycle. The question of how the dynamics, properly construed
for the biological scale in terms of the conjunction of statistical mechanics
and irreversible thermodynamics (Iberall, 1977; Prigogine, 1980; Soodak &
Iberall, 1978), might fashion the phenomenon is not addressed, nor is the
question of how the symbol strings interface with the dynamics. Pattee's
analysis is an important one for those students of movement who would pursue
the Cartesian program with its emphasis on the symbolic mode: Only in the
working out of the physics of a regularity can one identify the nature and
type of symbol strings (nonholonomic constraints) needed to complete the
explanation. To begin with the symbolic mode, and to adhere strictly to it,
invites an account that will be plagued by arbitrariness (as, surely, is the
account of the step cycle represented by Figure 5). To begin with the
dynamical mode, and to pursue it earnestly, promises an account that will be
principled.

There is, however, a deep problem with Pattee's Complementarity Principle.
For Pattee, the discrete symbol strings function as information in the
indicational sense. The proposed complementarity, therefore, is one of
indicational information and dynamics. The problem with endorsing a view of
indicational information and dynamics as formally incompatible is that it
rules out any explanation of the origin of indicational information. We and
others have recorded our disquiet with the Complementarity Principle for just
this reason (Carello, Turvey, Kugler, & Shaw, 1982; Kugler et al., 1982). One
suspects that for the consistency of physical theory, information in the
indicational sense should be lawfully derivable from dynamics (or information
in the specificational sense).

3.2 The Ring Principle (Physically Interpreted)

In this final section we provide an overview of the physical foundations
of Bernstein's ring principle. Whereas the cybernetical interpretation of the
ring principle is consistent with the Cartesian program, the interpretation
evolving in physical biology is consistent with the Gibsonian program.

3.2.1 Open systems and the role of causal dynamics

According to classical physics, living systems are continuously struggling
against the laws of physics. Within the last few decades, however, it has
become increasingly apparent that those physical systems that are open to the
flow of energy and matter into and out of their operational components behave
in a manner that suggests the behavior of living things and suggests a
dramatically different view of causal dynamics (see Yates, 1982a, 1982b, for a
review). Whereas the behavior of an isolated physical system is strictly
determined by the system's initial and boundary conditions, systems open to
the flow of energy and matter can evolve internal constraints that "free" the
system's dynamics from its initial conditions. The arising of the new internal
constraints serves to limit the trajectories of the internal components
(thereby reducing the system's internal degrees of freedom). As these
constraints arise, new spatio/temporal orderings are created and the system
derives new ways of doing business with its surroundings (that is, new ways of
transacting energy).

While living systems can be viewed as following from the laws of physics,
one distinguishing characteristic that emerges in systems of this order of
complexity is the ability to time-delay energy flows internally. This is
accomplished through the maintenance of internal potentials from which the
system can periodically draw energy so as to produce a generalized external
work cycle. This self-contained source of potential energy (usually in
chemical form) allows the system to be characterized as self-sustaining. The
ability to be self-sustaining means that the system's behavior is no longer
governed strictly by minimum energy trajectories or external work cycles
defined on surrounding gradient fields. The possibility now arises that a
self-sustained system can temporarily depart from the constraints defined by
the surrounding potential minimum. Departures from and returns to minimum
regions defined in the surrounding potential field require some form of
sensitivity to the gradients; and this, in turn, requires some form of
self-sustaining system. The ability to discriminate low order potential
gradients selectively (Frohlich, 197^; Volkenstein & Chernavskii, 1978) and
the ability to form an autonomous, persistent self-sustaining system (Iberall,
1972, 1977) are fundamental characteristics of living systems.

3.2.2 Determinate and nondeterminate trajectories in particle/field systems

Particle physics (classical, quantum and relativistic) studies the
trajectories of particles to infer the dispositions of potential fields. The
assumption underlying the above strategy is that variations in the observed
force field are strictly a function of the particle's position in the field.
The above assumption rests on two requirements: (i) that surrounding
potentials remain constant (in both space and time), and (ii) that the
particle has no internal means for introducing or absorbing forces (which
could contribute to a trajectory's departing from the minimum regions defined
by the surrounding potential field). Particles satisfying these two
requirements have their trajectories completely determined by the form of the
surrounding potential field: The minimum regions identify geometrical
singularities in a topological field. The particle system is completely
determined by and causally dependent on the topological form of the
surrounding potential field.

3.2.3 Self-sustaining systems and circular causality 

If, however, the particle system of interest has an internal means for
generating and dissipating forces of a magnitude comparable to the forces of
the surround (that is, the system is self-sustaining), then the behavior of
the particle need not be completely determined by the topological form of the
surrounding potential field. The particle has available internal sources and
sinks that can generate and absorb forces that, when combined with the forces
generated by the surrounding potential field, can yield equilibrium states
that are not strictly defined by the singularities of the surrounding
potential field. The behavior of this class of particle system can be said to
be nondeterminate with respect to its relationship with the surrounding
potential field (cf. Kugler, Kelso, & Turvey, 1982; Kugler, Turvey, & Shaw,
1982). While the particle's equilibrium states are no longer determinately
specified by the state of the surrounding potential field, the particle is
still, nonetheless, causally coupled to the forces generated by the
surrounding potential. That is to say, changes in the forces generated by the
surrounding potential field will require compensatory changes in the
particle's internally generated forces if an equilibrium state is to be
maintained invariant: The forces generated by the surround and the forces
generated internally are causally linked in a circular causality with respect
to invariant equilibrium states.
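
The compensation requirement just stated can be illustrated with a minimal
numerical sketch. Everything specific in it is an assumption made only for
illustration and is not part of the original analysis: the surround is
modeled as a one-dimensional quadratic potential, and the names
surround_force and required_internal_force are hypothetical.

```python
# Illustrative sketch only: a 1-D quadratic surround potential is ASSUMED.
# V(x) = 0.5 * k * (x - x0)**2, so the surround force is -k * (x - x0).

def surround_force(x, k, x0):
    """Force exerted on the particle by the assumed surrounding potential."""
    return -k * (x - x0)


def required_internal_force(x_eq, k, x0):
    """Internal force needed so that the net force vanishes at x_eq."""
    return -surround_force(x_eq, k, x0)


# The particle holds an equilibrium away from the surround's minimum (x0 = 0).
x_eq = 1.0
internal_before = required_internal_force(x_eq, k=2.0, x0=0.0)

# The surround changes (steeper gradient); the same equilibrium now demands
# a compensatory change in the internally generated force.
internal_after = required_internal_force(x_eq, k=5.0, x0=0.0)

net_after = surround_force(x_eq, k=5.0, x0=0.0) + internal_after
```

Under these assumptions the equilibrium position stays fixed only because the
internal force tracks the change in the surround, which is the sense of
coupling the paragraph describes.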

The physical concept of circular causality (cf. Iberall, 1977) is meant
to identify the lawful nature of the coupling that links the surrounding
potential field (and its associated force field) with the interior potential
field (and its associated force field). Self-sustaining systems and their
associated equilibrium states are lawfully coupled to the surrounding
potential field through circular causality; they are systems whose interior
potential fields play an active role in fashioning final equilibrium states.

Self-sustaining particle systems are characterized by low energy
couplings that relate the particle's position to its surrounding field. The
low energy coupling is defined relative to the external work cycle generated
by the particle. The coupling defines a ratio of the forces generated by the
particle's external work cycle in proportion to the forces generated by the
surrounding potential field. A dimensionless number can be used to distinguish
the nature of the coupling qualitatively:

Pi = (forces generated by the particle's external work cycle) /
     (forces generated by the surrounding potential field)

Pi < 1 = high energy coupling

Pi > 1 = low energy coupling

A high energy coupling (Pi < 1) defines a coupling in which the particle's
external work cycle is such that it is insufficient to resist the surrounding
field's potential gradients. If, however, an external work cycle is generated
by the particle that resists the surrounding field's potential gradients
(Pi > 1) and contributes actively in the organization of equilibrium states,
then the coupling can be considered to be of a low energy nature. The low
energy coupling realized by a self-sustaining system forms a lawful basis from
which a generalized theory of information can be derived.
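
As a hedged sketch of how the dimensionless ratio separates the two kinds of
coupling, the classification can be written out directly. The function names,
the scalar force magnitudes, and the handling of the boundary case Pi = 1
(which the text does not classify) are assumptions made for illustration.

```python
# Illustrative sketch: force magnitudes are ASSUMED to be positive scalars.

def coupling_ratio(work_cycle_force, field_force):
    """Pi: force of the particle's external work cycle over the field's force."""
    if field_force <= 0:
        raise ValueError("field_force must be positive")
    return work_cycle_force / field_force


def classify_coupling(pi):
    """Label the coupling by the thresholds given in the text."""
    if pi < 1:
        return "high energy coupling"  # work cycle cannot resist the gradients
    if pi > 1:
        return "low energy coupling"   # work cycle resists the gradients
    return "unclassified"              # Pi == 1 is not covered by the text


weak = classify_coupling(coupling_ratio(2.0, 4.0))
strong = classify_coupling(coupling_ratio(6.0, 4.0))
```

With these made-up magnitudes, weak comes out as a high energy coupling and
strong as a low energy coupling.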

3.2.4 Information and the ecological approach to perception and action

Central to the Gibsonian program is the claim that information must refer
to physical states of affairs that are specific and meaningful to the control
and coordination requirements of activity (Turvey & Carello, 1981).

Following Gibson (1950, 1966, 1979) the above requirements for information
are to be found in the qualitative properties captured in the structured
patterns of energy distributions coupling an animal to its environment (see
Section 2.2). These patterns (1) carry, in their topological form, properties
that are specific to components of change and components of persistence in the
animal-environment relation; (2) are meaningful (i.e., they define gradient
values) with respect to the animal's internal potentials; and (3) are lawfully
determined by the environment and by the animal's movements relative to the
environment. According to the Gibsonian program, information is a physical
variable that defines a coupling that is specific and meaningful with respect
to the changing geometry of the econiche (defined by the animal/environment
qua particle/field system totality). The energy patterns coupling the animal
(internal potential field) and environment (surrounding potential field) are
continuously scaled to the changing parameters and dimensionality of the
system (cf. Kugler et al., 1980). The information carried in the evolving
geometry of structured energy distributions is information about animal
dynamics (internal potential layout) relative to environmental dynamics
(surrounding potential layout). This concept of information is consistent with
Thom's view of information as geometric form (cf. Kugler et al., 198?):

any geometric form whatsoever can be the carrier of information, and
in the set of geometric forms carrying information of the same type
the topological complexity of the form is the qualitative scalar of
the information. (Thom, 1975, p. 145)

Information as a geometry of form (defined over potential fields) arises as an
a posteriori fact of the system. The information can be carried in the form
of geometric manifolds that are created, sustained, and dissolved within a
large variety of physical flow fields. The flow fields can be assembled out
of mechanical, chemical, or electro-magnetic constraints.

3.2.5 On the determinate nature of information and the non-determinate nature
of behavior

The goal of physics for the twentieth century has been to understand the
nature of the energy states exhibited by particles at all scales of magnitude.
The foundation of physics rests on the commitment (explicit or implicit) to
natural laws, that is, the commitment to a natural continuity in energy states
reducible to symmetry statements (equations) defined on conservations. The
strategy for defining natural laws rests on the identification of trajectories
assumed by particles. While this strategy has valid application for simple
particle systems (non-self-sustaining systems), its application toward
explicating the natural laws governing the energy states of self-sustaining
systems must be questioned seriously. The behavior of a self-sustaining system
is not strictly determined by the energy states of the surrounding potential
fields. As noted, the energy states of the internal potential field play an
active role in the determination of the observed trajectory. While the
behavior (observed trajectory) of a self-sustaining system has a
nondeterminate status, the informational states defining the low energy
coupling that relates the surrounding potential field to the internal
potential field have a determinate status. The information states are
invariant (i.e., stable and reproducible) in the strictest sense of lawful
determinism. A physical analysis of the behavior (observed trajectories) of
self-sustaining systems must entail an inquiry into the low energy
informational states that lawfully couple an animal (complex self-sustaining
particle) to its environment (surrounding potential field). (For an example of
a physical analysis of the role of low energy couplings, see Kugler, Turvey,
Carello, & Shaw, in press.) It can be argued that the goal of a physics
befitting Bernstein's physiology of activity is that of identifying the laws
that create, sustain and dissolve low energy informational states.

MAPPING SPEECH: MORE ANALYSIS, LESS SYNTHESIS, PLEASE*



Michael Studdert-Kennedy+



Stimulation mapping would be of little interest, if its achievements were
merely to assign brain loci to categories of linguistic or psychological
description. Our understanding of complex, intermodal functions, such as
naming or reading, would be blocked rather than advanced, if we were to
conclude, as Ojemann speculates, that each is a macrocolumn or module, an
impenetrable, vitreous chip in the great mosaic of language. The promise of
stimulation mapping is rather that it may reveal, by some pattern of
association and dissociation, the simpler mechanisms from which a function
emerges and, ultimately, its underlying neural circuitry. To fulfill this
promise, stimulation studies should not adopt uncritically the familiar,
nonanalytic, modality-based tests of aphasia assessment. Rather, these tests
or others must be given a functional analysis in terms of clear-cut
psycholinguistic hypotheses. Naming, for example, is not a unitary function:
naming errors may reflect perceptual, semantic, or phonological deficits
(Goodglass, 1980), and the source of naming errors may often be inferred from
their form (e.g., Katz, 1982). Similarly, deficits in oral reading are open to
increasingly sophisticated analysis in terms of phonological segmentation,
lexical access, and phonetic execution (e.g., Liberman, Liberman, Mattingly, &
Shankweiler, 1980). Unfortunately, little in the target paper suggests that
systematic analysis of this kind was attempted.

Ojemann did, however, test one "lower" function that might plausibly be
expected to enter into a pattern of relations with several others, namely,
orofacial mimicry. If the ability to produce simple movements of the mouth is
impaired, one would not be surprised if the ability to speak, in naming or
reading, or even to recall a word (if short-term storage engages a motor
representation), were also impaired. In fact, very much these relations were
observed (though not with perfect consistency). Yet interpretation of even
this modest pattern of associations is hazardous.

Before turning to this, consider how we might interpret the most
controversial link with impaired mimicry: impaired phoneme identification. The
key, at least to the frontal lobe sites, is hinted at in the sparsely reported
findings of Darwin, Taylor, and Milner (Ettlinger, Teuber, & Milner, 1975,
p. 132), cited by Ojemann. These authors discovered that patients in whom the
left facial regions had been excised, for relief of epilepsy, were impaired
both in spelling to dictation and in the same phoneme identification task as
Ojemann used. Since these patients could understand and talk normally, Darwin
and his colleagues concluded that their difficulty was confined to tasks
stressing the phonetic structure of speech sounds. This conclusion implies
that the identification of phonemes in nonsense words may be essentially a
nonlinguistic task (inasmuch as it bypasses the lexical and syntactic
processes of normal speech perception), a task very close, in fact, to
mimicry. Ojemann stresses that stimulation during the phoneme identification
test occurred only during presentation of the stimulus. However, since the
patients had simply to reproduce what they heard, their task reduced to
finding, during presentation, the motor pattern specified by the stimulus.
Moreover, if, as Ojemann suggests, motor sequence programs were stored in the
temporoparietal region, this account would also handle the temporoparietal
links between phoneme identification (accomplished by execution of a
two-syllable nonsense word) and three-gesture mimicry.

*Commentary on Ojemann, G. A. Brain organization for language from the
perspective of electrical stimulation mapping. The Behavioral and Brain
Sciences, 1983, 6, 218-219.

+Also Queens College and Graduate Center, City University of New York.
Acknowledgment. My thanks to Virginia Mann for advice. Preparation of this
comment was supported in part by NICHD Grant No. HD-01994 to Haskins
Laboratories.

We may note, incidentally, that all phoneme identification tasks, calling
for metalinguistic judgments, may reveal more about structural correspondences
between audition and articulation than about normal processes of speech
perception. Accordingly, even if a motor-reference theory of speech perception
were still viable, Ojemann's findings would have little bearing on it. In
fact, the link between perception and production is probably deeper and less
tortuous than the old motor theory proposed. As Ojemann himself hints, the
link is clearest in language acquisition, where the child learns to speak by
discovering the articulatory dynamics specified by the speech it hears.

I come next to the pattern of associations and dissociations between
functions, rather confusingly charted for a seven-subject series in Ojemann's
Figures 3 and 4. Apparently, the circles of Figure 3 correspond to the large
circles of Figure 4. There are 52 large circles (25 frontal, 9 parietal, 18
temporal), representing sites where all five language functions were tested.
In addition, Figure 4 displays (by my count) 27 small circles (4 frontal, 10
parietal, 13 temporal), representing sites where all functions, except
orofacial mimicry, were tested. This brings us to a total of 79 sites. Of
these, 15 seem to have yielded no result, leaving us with 64 effective sites
(24 frontal, 18 parietal, 22 temporal), an average of about 9 per patient. By
arduous tabulation, we can discover the links between functions impaired at
each site, though not, unfortunately, how these links were distributed across
subjects.
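
The site counts in this paragraph can be checked with a few lines of
arithmetic. The sketch below uses only the numbers quoted above; the per-lobe
breakdown of the sites that yielded no result is not reported in the text and
is derived here by subtraction, so it should be read as an inference.

```python
# Counts quoted in the paragraph above (seven-subject series).
large_circles = {"frontal": 25, "parietal": 9, "temporal": 18}
small_circles = {"frontal": 4, "parietal": 10, "temporal": 13}
effective = {"frontal": 24, "parietal": 18, "temporal": 22}
n_subjects = 7

# Sites tested per lobe, and overall.
tested = {lobe: large_circles[lobe] + small_circles[lobe]
          for lobe in large_circles}
total_tested = sum(tested.values())

# Sites with no result per lobe: DERIVED by subtraction, not reported.
no_result = {lobe: tested[lobe] - effective[lobe] for lobe in tested}

total_effective = sum(effective.values())
mean_sites_per_patient = total_effective / n_subjects

print(total_tested, total_effective, round(mean_sites_per_patient, 1))
```

The totals agree with those given in the paragraph: 79 sites tested, 64
effective, and roughly 9 effective sites per patient.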

Among my findings was the fact, touched on by Ojemann in his discussion
of discrete localization, that every function (except, interestingly, mimicry)
is disturbed alone on at least one site, a dissociation that effectively
demonstrates the absence of causal relations among the functions in at least
those patients who display it. But what is most remarkable is that every
function (except single-gesture mimicry, confined to the frontal lobe) is
impaired in every area. Thus, short-term memory (STM) is impaired at [...]
frontal, 11/18 parietal, 7/22 temporal sites; reading and/or naming are
impaired at 13/24 frontal, 9/18 parietal, 11/22 temporal sites; three-gesture
mimicry is impaired at 4/23 frontal, 5/6 parietal, 5/13 temporal sites;
phoneme identification is impaired at 9/24 frontal, 5/18 parietal, and 6/22
temporal sites. How are we to square this distribution with Ojemann's model,
assigning retrieval to the frontal, storage to the temporo-parietal lobes?
Moreover, Ojemann reports that STM was tested with stimulation at the time of
input, storage, or retrieval, but these distinctions are not preserved in the
report of the data. Were all frontal STM deficits confined to retrieval, all
temporo-parietal deficits to storage?

The problem worsens as soon as we leave these statistical patterns and
consider individual subjects, a reasonable move if we are interested in
mechanism. Ojemann acknowledges a high degree of individual variability, but
assures us that "the interrelationships described in the model can be readily
identified in individual patients, such as the one illustrated in Figure 2."
Why, then, were we not given a comparable figure (or at least a table) laying
out the pattern of relations for each subject? Yet, even if we had the
individual data, we would have to be cautious in interpretation. If two
functions are dissociated, we can be confident that there is no necessary
connection between them. However, even if they are regularly associated, we
cannot infer a necessary connection. This is so because electrodes are large
relative to nervous tissue, so that we cannot be sure that the association is
not due to blocking of distinct, though closely neighboring, functions. These
limitations are inherent in the stimulation-mapping technique in its present
stage of refinement.

On the other hand, as Ojemann argues, the fact that certain associations
do recur over wide areas of peri-Sylvian cortex is encouragement enough for
continued research. Ojemann is to be honored for rediscovering a valuable
technique, the most precise that we have to analyze the neural circuitry of
functioning human cortex, and for leading the way toward important new
discoveries.

BOOK REVIEW*



R. E. Asher and Eugenie J. A. Henderson (Eds.). (1981). Towards a history of
phonetics (pp. 1-317). Scotland: Edinburgh University Press.

Leigh Lisker+



The editors of this volume, a collection of eighteen papers on various
topics in the history of phonetic ideas and of writing systems, had two
purposes in undertaking their enterprise: to do honor to David Abercrombie on
the related occasions of his seventieth birthday and retirement from the chair
of phonetics at the University of Edinburgh, and to make a beginning toward a
systematic history of phonetics, one of the enduring interests of the
distinguished scholar being saluted. As the editors note, this book is nothing
like a complete or integrated history of the field, or even a first
approximation to one; with nineteen authors from six countries and the
inevitable diversity that multiple and independent authorship entails, this
was not to be expected. It does, the editors say, include discussion of those
matters that should be important components of any proper history. The papers
are arranged into six "parts," three of which deal with the development of
phonetic ideas (basic concepts, processes, voice quality, and voice dynamics),
one on "national contributions," one on the achievements of individual
scholars, and one on writing systems.

The contributions included in the first three parts of the book range
over the following topics: feature classification (V. A. Fromkin and
P. Ladefoged), the phonetics-phonology distinction of Kruszewski and the
Kazan' School (K. H. Albrow), the articulatory versus acoustic-auditory
description of vowels (J. C. Catford), Western traditions in the description
of nasals (J. A. Kemp), early experimental studies of coarticulation
(W. J. Hardcastle), consonantal rounding in British English (G. Brown), and
the auditory analysis of voice quality (J. Laver) and prosody (M. Sumera). Of
these papers, all but three are of interest largely as intended, that is, as
history. The essays by Catford, Hardcastle, and Brown, while
historically-oriented, might also engage the attention of an
ahistorically-minded contemporary speech researcher. Catford makes a spirited
yet judicious defense of the traditional vowel-height model of vowel
classification developed by A. M. Bell and elaborated by Henry Sweet and
Daniel Jones. Hardcastle's paper reminds us that the phenomenon of
coarticulation, currently very much under scrutiny, is by no means a recent
discovery, and that the view of speech as a sequence of static positions or
sounds linked by glides has been explicitly recognized as fallacious for at
least a century. Brown discusses the relatively neglected feature of lip
rounding as a property of British English consonants, casts doubt on
historical inferences based on the absence of reference to it in earlier
descriptions of the language, and suggests that the so-called rounded vowels
are less reliably marked by rounding than are certain of the consonants.

*Also Language in Society, 1983, 12, 369-371.
+Also University of Pennsylvania.

Two papers deal with the contributions of individual phoneticians. One
is by R. Thelwall, who describes the career of a relative, John Thelwall, a
speech therapist and lecturer on "elocutionary science" in London during the
late eighteenth and early nineteenth centuries. The other is a brief
autobiographical sketch by K. L. Pike that describes the path he followed in
his development as a phonetician; naturally enough, it does not begin to do
justice to the contributions that make him the most accomplished phonetician
among the American linguists of this and the last four decades.

W. S. Allen, who contributed significantly to our historical perspective
with his study of the phonetics of ancient India, here gives an account of the
phonetic thought of the ancient Greeks, and presents evidence to support his
view that in phonetic analysis they were not up to the Indians, even if they
elaborated an alphabet that he judges to be more useful a tool for phonetic
analysis than the Devanagari. Two papers, by M. A. K. Halliday and by
N-C. T. Chang, are densely packed with information on the development of
phonetic and phonological thought in China. Both papers are rich in leads for
those interested in pursuing the relation between phonetic theory and
orthography in the Chinese context. Developments in phonetics in
nineteenth-century Germany are, of course, much nearer home (hardly any more
"national" than those in Britain), and K. Kohler goes well beyond a historical
accounting to conduct a forceful polemic against what he deplores as the
unfortunate separation of "linguistic phonetics" from phonetics in its
physical, physiological, and psychological aspects, a development for which he
holds Sievers mainly responsible. He argues vigorously against the separation
of phonology from phonetics, which he terms an "ominous schism," and instead
champions the idea of an independent discipline of "speech science" freed of
any "unfortunate" dependence on linguistics and open to the disciplines that
have something to contribute to the study of speech in all its aspects.

The last four papers are concerned with writing systems that can be said
to represent, more or less imperfectly, the phonetic properties of speech.
Two of them are devoted to the history of efforts to fit systems borrowed from
one linguistic setting to another rather different one. A revision of a study
by Abercrombie himself describes attempts over the past four centuries to
repair the perceived deficiencies of the Latin (or Roman) alphabet as a
vehicle for English, while J. Maw deals with the refashioning by scholars and
others of the Arabic and Latin orthographies as devices for representing
Swahili. Both papers go into detail in recording the many attempts at script
"reform," but we can only infer from the extent to which the various changes
proposed gained general acceptance just how widespread was dissatisfaction
with the orthographical status quo. A paper by J. Kelly and one by
M. K. C. MacMahon recount the history of the development of the various
"shorthands" invented and promoted by a number of phoneticians during the late
eighteenth and early nineteenth centuries: A. J. Ellis, I. Pitman, A. M. Bell,
and Henry Sweet. These systems were, for the most part, intended to serve as
auxiliaries to the standard orthography, for scientific and secretarial
purposes. The modern reader may be surprised to learn how large a role
phoneticians played in a matter that aroused a good deal of public interest
and contention at the time; the recent popularity of Shaw's Pygmalion brought
about no revival of interest in the means by which Professor Higgins captured
on paper the details of Liza's speech patterns.

A bibliography of David Abercrombie's published works, prepared by
Elizabeth Uldall, a list of the names of persons and institutions whose
subscriptions aided its publication, and indices of personal names and
subjects complete the book. The editors and publisher are to be congratulated
for the aesthetically pleasing format and remarkably error-free printing of an
extremely demanding text.

Towards a history of phonetics succeeds in making a case for the serious
study of the development of ideas about the nature of speech as an important
aspect of the history of linguistics. Not that anyone would contend that the
history of phonetic thought is unworthy of scholarly attention, but the
contributors to this volume demonstrate that a wealth of readily accessible
materials awaits the historian who can organize into a lucid picture men's
opinions, different over time and place and cultures, regarding the nature of
that uniquely human product, the act of speech.